Re: BK Client connection loss with ZK

2016-06-07 Thread Uma gangumalla
Good point, Venkateswara Rao.

Some time ago we worked on this scenario, and a patch is available:
HDFS-3562.
There we just tried to handle it on the application side. As a long-term
solution, though, could this be placed on the BK side as a utility module, so
that all applications can benefit?


Note: As I remember, the RetryableZookeeper idea was taken from HBase.
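
A minimal sketch of what such a retry utility might look like, assuming the caller wraps each ZK operation in a Callable. The class name ZkRetrySketch, the method withRetries, and the local ConnectionLossException are all hypothetical stand-ins for illustration; this is not the code from HDFS-3562 or from HBase.

```java
import java.util.concurrent.Callable;

final class ZkRetrySketch {

    /** Hypothetical stand-in for a ZK connection-loss error, so the sketch has no dependencies. */
    static class ConnectionLossException extends Exception {
        ConnectionLossException(String msg) {
            super(msg);
        }
    }

    /**
     * Runs op and retries it when the connection is lost, doubling the
     * wait between attempts. Gives up after maxRetries retries and
     * rethrows the last connection-loss error.
     */
    static <T> T withRetries(Callable<T> op, int maxRetries, long baseBackoffMs)
            throws Exception {
        long backoffMs = baseBackoffMs;
        for (int attempt = 0; ; attempt++) {
            try {
                return op.call();
            } catch (ConnectionLossException e) {
                if (attempt >= maxRetries) {
                    throw e; // retries exhausted, let the application decide
                }
                Thread.sleep(backoffMs);
                backoffMs *= 2; // exponential backoff
            }
        }
    }
}
```

One caveat with this approach: after a connection loss the caller cannot tell whether the request was applied on the server, so the wrapped operation should be idempotent or should check state before it is repeated.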

Regards,
Uma

On Mon, Jun 6, 2016 at 9:42 AM, Venkateswara Rao Jujjuri 
wrote:

> If a bookie loses its connection with ZK, the connection gets reestablished
> and life goes on. How are we handling this in the client case? Should we
> retry at the library level,
> or leave it up to the application? Any discussion/thoughts on this?
>
> --
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi
>


Re: Improve Write performance with Relax durability.

2016-06-07 Thread Sijie Guo
I think that's a fair consideration. However, I am thinking that if we allow
non-durable ledgers, that means 1) the application needs to handle the missing
entries; 2) re-replication should handle non-durable ledgers by ignoring
entries that are missing.
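
To make point 2 concrete, here is a rough sketch of a copy loop that skips missing entries for a non-durable ledger but still fails for a durable one. The class NonDurableReplicationSketch, the method copyRange, and the in-memory maps standing in for bookies are assumptions for illustration only, not existing BookKeeper APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class NonDurableReplicationSketch {

    /**
     * Copies entries firstEntry..lastEntry (inclusive) from source to target.
     * The maps are hypothetical stand-ins for the entries stored on two bookies.
     * For a durable ledger a missing entry is an error; for a non-durable
     * ledger it is skipped and reported back to the caller.
     */
    static List<Long> copyRange(Map<Long, byte[]> source, Map<Long, byte[]> target,
                                long firstEntry, long lastEntry, boolean durable) {
        List<Long> missing = new ArrayList<>();
        for (long id = firstEntry; id <= lastEntry; id++) {
            byte[] data = source.get(id);
            if (data != null) {
                target.put(id, data);
            } else if (durable) {
                throw new IllegalStateException(
                        "entry " + id + " missing from durable ledger");
            } else {
                missing.add(id); // tolerated: the ledger is sparse at this id
            }
        }
        return missing;
    }
}
```

The returned list is what the application would have to handle under point 1: the ids it can no longer read.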

But let's see what Jia is proposing.

- Sijie

On Fri, Jun 3, 2016 at 8:57 AM, Venkateswara Rao Jujjuri 
wrote:

> @sijie let me expand on what I mean by "this changes something fundamental".
>
> Everything starts from the fact that we are not persisting. Also, I share a
> lot of the points raised by @Matteo.
>
> - In theory, we could lose all copies of EntryId X but persist EntryId
> X+Y. How do reads, replication, and consistency cope with that?
> - We could advance LAC, but lose the last set of entries. What do we do?
> Do we adjust LAC? At what boundaries?
> - One of the core principles of a LOG is that if entry X is there, all the
> entries up until X are available too. With this we may need to deal with
> sparse / missing entries.
>
> I believe this is more of a direction towards making BookKeeper an in-memory
> log, but I am afraid it is more of a core change.
>
> Thanks,
> JV
>
> On Fri, Jun 3, 2016 at 12:05 AM, Matteo Merli  wrote:
>
>> I was interested in trying something in this area, but never actually got
>> to do it.
>>
>> A few random notes:
>>
>> 1. My suspicion, with no backing data at this point, is that simply skipping
>> the fsync for "non-durable" ledgers might not give a big improvement, just a
>> bit less latency for non-fsynced writes but roughly the same throughput.
>> Imagine a bookie receiving writes for 2 ledgers, 1 durable and the other
>> non-durable. Since the entries are appended to the journal as they come in,
>> the fsync() for the durable ledger write will also carry along the data for
>> the previous non-durable ledger write, causing more IOPS if that write
>> spanned a different disk block. Given that the bookie throughput is
>> typically limited by the IOPS capacity of the journal device, having
>> non-durable writes might not help that much.
>>
>> 2. The other options I was thinking of were:
>> - Do not append the non-durable entries to the journal (redundancy is
>>   given anyway by writing to multiple bookies). In this case though, a
>>   single bookie could lose more entries depending on flushTime, and could
>>   also lose entries even in case of a process crash, not just a kernel
>>   panic or power outage.
>>
>> - Use a separate journal for non-durable writes, which will not be fsynced.
>>
>> - Configure the durability at the bookie level and then use a
>>   placement/isolation policy to choose the appropriate set of bookies for
>>   a non-durable ledger.
>>
>> 3. How will bookie replication operate when it gets read errors?
>>
>> Matteo
>>
>> On Thu, Jun 2, 2016 at 11:09 PM Sijie Guo  wrote:
>>
>> > I think if a ledger is configured to be non-durable, it is kind of the
>> > application's responsibility to tolerate the data loss.
>> > So I don't think we will actually have to change anything on the
>> > bookkeeper client side.
>> >
>> > - Sijie
>> >
>> > On Thu, Jun 2, 2016 at 7:29 AM, Venkateswara Rao Jujjuri <
>> > jujj...@gmail.com>
>> > wrote:
>> >
>> > > I agree that we must make this a ledger property, not a per-entry write
>> > > property.
>> > >
>> > > But the biggest doubt in my mind is that this changes something
>> > > fundamental: LAC. Are we allowing sparse ledgers in failure scenarios?
>> > > Handling the read side may become more complex.
>> > >
>> > > On Thu, Jun 2, 2016 at 12:19 AM, Sijie Guo 
>> wrote:
>> > >
>> > >> This seems interesting to me. However, it might be safer to start with
>> > >> a flag configured per ledger, rather than per entry. Also, it would be
>> > >> good to hear opinions from other people. JV, Matteo? (If I remember
>> > >> correctly, Matteo mentioned that Yahoo might be working on a similar
>> > >> thing.)
>> > >>
>> > >> +1 for creating a BOOKKEEPER jira to track this.
>> > >>
>> > >> - Sijie
>> > >>
>> > >> On Wed, Jun 1, 2016 at 6:37 PM, Jia Zhai 
>> wrote:
>> > >>
>> > >> > + distributedlog-user
>> > >> > For more input and comments. :)
>> > >> >
>> > >> > Thanks.
>> > >> >
>> > >> > On Thu, Jun 2, 2016 at 9:34 AM, Jia Zhai 
>> wrote:
>> > >> >
>> > >> >> Hello all,
>> > >> >>
>> > >> >> I am wondering whether you guys have any plans to support relaxed
>> > >> >> durability. Is it a good feature to have in bookkeeper (also for
>> > >> >> DistributedLog)?
>> > >> >>
>> > >> >> I am thinking of adding a new flag to bookkeeper#addEntry(..., Boolean
>> > >> >> sync), so the application can control whether to sync or not for
>> > >> >> individual entries.
>> > >> >>
>> > >> >> - On the write protocol, adding a flag to indicate whether this write
>> > >> >> should sync to disk or not.
>> > >> >> - 

Build failed in Jenkins: bookkeeper-master #1402

2016-06-07 Thread Apache Jenkins Server
See 

--
[...truncated 380 lines...]
Tests run: 0, Failures: 0, Errors: 0, Skipped: 0

[INFO] 
[INFO] --- maven-jar-plugin:2.3.1:jar (default-jar) @ bookkeeper-stats-api ---
[INFO] Building jar: 

[INFO] 
[INFO] >>> findbugs-maven-plugin:2.5.2:check (default-cli) @ bookkeeper-stats-api >>>
[INFO] 
[INFO] --- findbugs-maven-plugin:2.5.2:findbugs (findbugs) @ bookkeeper-stats-api ---
[INFO] Fork Value is true
[INFO] Done FindBugs Analysis
[INFO] 
[INFO] <<< findbugs-maven-plugin:2.5.2:check (default-cli) @ bookkeeper-stats-api <<<
[INFO] 
[INFO] --- findbugs-maven-plugin:2.5.2:check (default-cli) @ bookkeeper-stats-api ---
[INFO] BugInstance size is 0
[INFO] Error size is 0
[INFO] No errors/warnings found
[INFO] 
[INFO] 
[INFO] Building bookkeeper-server 4.5.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ bookkeeper-server ---
[INFO] Deleting 
 
(includes = [dependency-reduced-pom.xml], excludes = [])
[INFO] 
[INFO] --- apache-rat-plugin:0.7:check (default-cli) @ bookkeeper-server ---
[INFO] Exclude: **/DataFormats.java
[INFO] Exclude: **/BookkeeperProtocol.java
[INFO] Exclude: **/TestDataFormats.java
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.1:process (default) @ bookkeeper-server ---
[INFO] 
[INFO] --- maven-resources-plugin:2.4.3:resources (default-resources) @ bookkeeper-server ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 3 resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.0:compile (default-compile) @ bookkeeper-server ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 208 source files to 

[INFO] 
[INFO] --- maven-resources-plugin:2.4.3:testResources (default-testResources) @ bookkeeper-server ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.0:testCompile (default-testCompile) @ bookkeeper-server ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 101 source files to 

[INFO] 
[INFO] --- maven-surefire-plugin:2.9:test (default-test) @ bookkeeper-server ---
[INFO] Surefire report directory: 


---
 T E S T S
---

Running org.apache.bookkeeper.meta.TestZkVersion
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.076 sec
Running org.apache.bookkeeper.meta.LedgerManagerIteratorTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.722 sec
Running org.apache.bookkeeper.meta.TestZkLedgerIdGenerator
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.837 sec
Running org.apache.bookkeeper.meta.LedgerLayoutTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.69 sec
Running org.apache.bookkeeper.meta.TestLedgerManager
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.426 sec
Running org.apache.bookkeeper.meta.GcLedgersTest
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.49 sec
Running org.apache.bookkeeper.bookie.LedgerCacheTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.004 sec
Running org.apache.bookkeeper.bookie.BookieThreadTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.167 sec
Running org.apache.bookkeeper.bookie.TestSyncThread
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.967 sec
Running org.apache.bookkeeper.bookie.IndexCorruptionTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 64.513 sec
Running org.apache.bookkeeper.bookie.CompactionTest
Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 99.939 sec
Running org.apache.bookkeeper.bookie.UpdateCookieCmdTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.135 sec
Running org.apache.bookkeeper.bookie.BookieInitializationTest
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.531 sec
Running org.apache.bookkeeper.bookie.BookieJournalTest
Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time