[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291882#comment-16291882
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user asdf2014 commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
Hi, @breed. Thanks for your comment. You are right, we should keep the 
epoch large enough to avoid epoch overflow. So in my second comment I offered 
a better solution: a 24-bit epoch. Even if a leader election happened once 
every hour, we would not experience epoch overflow until **1915.2** years 
later.
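
As a rough check of this estimate, here is a minimal sketch. The class and method names are hypothetical, and a 365-day year is assumed. It computes how long an epoch field of a given width lasts at a given election rate: with 24 bits, one election per hour gives about 1915.2 years, while one election per minute gives about 31.9 years.

```java
final class EpochLifetime {
    // Years until an epoch field of `epochBits` bits is used up, assuming one
    // leader election (one epoch increment) every `secondsPerElection` seconds.
    // Assumption: a year is 365 days.
    static double yearsUntilEpochOverflow(int epochBits, double secondsPerElection) {
        double epochs = Math.pow(2, epochBits);
        double secondsPerYear = 365.0 * 24 * 60 * 60;
        return epochs * secondsPerElection / secondsPerYear;
    }

    public static void main(String[] args) {
        System.out.printf("24-bit epoch, one election per minute: %.1f years%n",
                yearsUntilEpochOverflow(24, 60));
        System.out.printf("24-bit epoch, one election per hour:   %.1f years%n",
                yearsUntilEpochOverflow(24, 3600));
    }
}
```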


![image](https://user-images.githubusercontent.com/8108788/34022152-9f04832c-e178-11e7-9bf3-c1b047613dae.png)



> Reassign `ZXID` for solving 32bit overflow problem
> --
>
> Key: ZOOKEEPER-2789
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2789
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.5.3
>Reporter: Benedict Jin
>Assignee: Benedict Jin
> Fix For: 3.6.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> At `1k/s` ops, the 32-bit counter of the `ZXID` will be exhausted after 
> $2^{32} / (86400 \times 1000) \approx 49.7$ days. But if we reassign the 
> `ZXID` into 16 bits for `epoch` and 48 bits for `counter`, then the problem 
> will not occur until after 
> $\min(2^{16} / 365,\ 2^{48} / (86400 \times 1000 \times 365)) \approx \min(179.6, 8925.5) = 179.6$ 
> years.
> However, the `ZXID` is a `long`, and the JVM may split a read or write of a 
> `long` (and likewise of a `double`) into separate operations on the high 
> 32 bits and the low 32 bits. Because the `ZXID` variable is not declared 
> `volatile` and is not boxed into the corresponding reference type (`Long` / 
> `Double`), such an access is a 
> [non-atomic operation](https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.7). 
> Thus, if some of the lower bits of the upper 32 bits are reassigned to the 
> counter, so that the counter spans both 32-bit halves of the `long`, there 
> may be a concurrency problem.
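
The arithmetic in the description can be reproduced with a small sketch. The class and method names are hypothetical. The `2^16 / 365` term is read here as assuming one leader election per day, which the description does not state explicitly.

```java
final class ZxidExhaustion {
    static final double SECONDS_PER_DAY = 86_400.0;

    // Days until a counter of `counterBits` bits is exhausted at `opsPerSecond`.
    static double daysUntilCounterExhausted(int counterBits, double opsPerSecond) {
        return Math.pow(2, counterBits) / (SECONDS_PER_DAY * opsPerSecond);
    }

    // Years until an epoch of `epochBits` bits is exhausted.
    // Assumption: one leader election (epoch increment) per day, 365-day year.
    static double yearsUntilEpochExhausted(int epochBits) {
        return Math.pow(2, epochBits) / 365.0;
    }

    public static void main(String[] args) {
        double days32 = daysUntilCounterExhausted(32, 1000);
        double years48 = daysUntilCounterExhausted(48, 1000) / 365.0;
        double years16 = yearsUntilEpochExhausted(16);
        System.out.printf("32-bit counter at 1k ops/s: %.1f days%n", days32);
        System.out.printf("48-bit counter at 1k ops/s: %.1f years%n", years48);
        System.out.printf("16-bit epoch, 1 election/day: %.1f years%n", years16);
        System.out.printf("16/48 layout is limited by: %.1f years%n",
                Math.min(years16, years48));
    }
}
```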



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291100#comment-16291100
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user breed commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
i think it would be much better to extend ZOOKEEPER-1277 to more 
transparently do the rollover without a full leader election.

the main issue i have with shortening the epoch size is that once the epoch 
hits the maximum value the ensemble is stuck, nothing can proceed, so we really 
need to keep the epoch size big enough that we would never hit that condition. 
i don't think a 16-bit epoch satisfies that requirement.




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290405#comment-16290405
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user asdf2014 commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
Hi, @phunt. Indeed, the `FastLeaderElection` algorithm is very efficient. 
Most leader elections finish within a few hundred milliseconds. 
However, some real-time stream frameworks such as Apache Kafka and Apache Storm 
can put a lot of pressure on the ZooKeeper cluster when they carry too 
much business data or processing logic. In that case, leader election may be 
triggered very frequently and the process becomes time consuming.




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290360#comment-16290360
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user phunt commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
Ok, thanks for the update. fwiw restarting taking a few minutes is going to 
be an issue regardless, no? Any regular type issue, such as a temporary network 
outage, could cause the quorum to be lost and leader election triggered.




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-12-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288015#comment-16288015
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user phunt commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
Are you seeing this behavior with ZOOKEEPER-1277 applied? If so it's a bug 
in that change, because after that's applied the leader should shutdown as we 
approach the rollover.

It would be nice to address this by changing the zxid semantics, but I 
don't believe that's a great idea. Instead I would rather see us address any 
shortcoming in my original fix (1277)

fwiw - what I have seen people do in this situation is to monitor the zxid 
and when it gets close (say within 10%) of the rollover they have an automated 
script which restarts the leader, which forces a re-election. However 1277 
should be doing this for you.
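
The monitoring approach described above could look roughly like the following sketch. It assumes the current layout (the low 32 bits of the zxid are the counter). The class name, method names, and the 10% margin parameter are hypothetical, and how the zxid is obtained from the ensemble is left out.

```java
final class ZxidRolloverMonitor {
    // Assumption: low 32 bits of the zxid hold the counter.
    static final long COUNTER_MASK = 0xffffffffL;

    // Fraction of the 32-bit counter space already used by this zxid.
    static double counterUsage(long zxid) {
        return (double) (zxid & COUNTER_MASK) / (double) COUNTER_MASK;
    }

    // True when the counter is within `margin` (e.g. 0.10 for 10%) of rollover,
    // i.e. when a script might decide to restart the leader.
    static boolean nearRollover(long zxid, double margin) {
        return counterUsage(zxid) >= 1.0 - margin;
    }

    public static void main(String[] args) {
        long zxid = (5L << 32) | 0xF0000000L; // epoch 5, counter about 94% used
        System.out.println("near rollover: " + nearRollover(zxid, 0.10));
    }
}
```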

Given you are seeing this issue perhaps you can help with resolving any 
bugs in 1277? thanks!






[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282892#comment-16282892
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user asdf2014 commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
@JarenGlover It's a good idea, but not the best solution. For now we can 
still use the `restart` operation to work around this problem without any 
changes to ZooKeeper. (BTW, you are welcome to read more details about this 
problem in my 
[blog](https://yuzhouwan.com/posts/31915#%e6%9e%b6%e6%9e%84%e8%ae%be%e8%ae%a1) 
:-)




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282229#comment-16282229
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user JarenGlover commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
@yunfan123 @asdf2014 i have seen this issue twice over a one-month period.

is there anything one can do to prevent this from happening? maybe allowing 
for leader restarts at "off peak hours" weekly? (yuck, i know)

it sounds like we can move forward with this if we move to 48 low bits, 
correct? 

note version: `3.4.10`




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051530#comment-16051530
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user yunfan123 commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
Hi, @asdf2014 
In most cases, I don't think the epoch can overflow 16 bits.
In general, ZooKeeper leader election is very rare, and it may take several 
seconds or even several minutes to finish a leader election.
And ZooKeeper is totally unavailable during leader election.
If the ZooKeeper ensemble you use can overflow 16 bits, that means the 
ensemble is totally unreliable.
Finally, compatibility with old versions is really important.
If the change is not compatible with old versions, I must restart all my 
ZooKeeper nodes. All of the nodes need to reload the snapshot and log from 
disk, which will cost a lot of time.
I believe this upgrade process is unacceptable to most ZooKeeper users.





[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-06-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050155#comment-16050155
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user asdf2014 commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
Hi, @yunfan123. Thank you for your suggestion. As you said, that approach 
can guarantee a smooth upgrade. However, if the 16-bit `epoch` overflows 
rather than the `counter`, ZooKeeper can no longer keep providing service 
through re-election. So I think we should keep enough space for `epoch`. 
What do you think?




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-06-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050139#comment-16050139
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user yunfan123 commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
Hi, I think 48 low bits is better for a high-throughput zk cluster.
Another benefit is that when we use 48 low bits we are assuming the epoch is 
lower than (1<<16), so we can use the high 16 bits to judge whether a zxid is 
in the old format or the new format.
So using 48 low bits lets the upgrade proceed smoothly.
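
One possible reading of this suggestion, as a hedged sketch: under the old 32/32 layout a zxid whose epoch is below `1 << 16` has all-zero high 16 bits, while under a 16/48 layout those bits hold the epoch. The class and method names are hypothetical. The check cannot distinguish a new-layout zxid with epoch 0, and it misclassifies an old-layout zxid whose epoch has reached `1 << 16`.

```java
final class ZxidLayoutGuess {
    static final long OLD_COUNTER_MASK = 0xffffffffL;     // low 32 bits
    static final long NEW_COUNTER_MASK = 0xffffffffffffL; // low 48 bits

    // Assumption: non-zero high 16 bits mean the 16-bit-epoch / 48-bit-counter layout.
    static boolean looksLikeNewLayout(long zxid) {
        return (zxid >>> 48) != 0;
    }

    static long epochOf(long zxid) {
        return looksLikeNewLayout(zxid) ? (zxid >>> 48) : (zxid >>> 32);
    }

    static long counterOf(long zxid) {
        return looksLikeNewLayout(zxid) ? (zxid & NEW_COUNTER_MASK)
                                        : (zxid & OLD_COUNTER_MASK);
    }

    public static void main(String[] args) {
        long oldZxid = (7L << 32) | 42L;
        long newZxid = (7L << 48) | (1L << 40);
        System.out.println(epochOf(oldZxid) + " / " + counterOf(oldZxid));
        System.out.println(epochOf(newZxid) + " / " + counterOf(newZxid));
    }
}
```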




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036810#comment-16036810
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user asdf2014 commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
@maoling You are welcome. I have already changed it to the following 
[code](https://github.com/apache/zookeeper/pull/262/files#diff-f4e58b67b9a4084420cb9b58398a953cR125).
 With that, I think it can still guarantee idempotency.
```java
long count = ZxidUtils.getCounterFromZxid(zxid);
long epoch = ZxidUtils.getEpochFromZxid(zxid);
```
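
To illustrate what helpers of this kind do, here is a minimal, self-contained sketch of splitting and building a zxid for a configurable epoch width. `ZxidLayout` and its methods are hypothetical names for illustration only, not the `ZxidUtils` implementation from the pull request.

```java
final class ZxidLayout {
    private final int counterBits;
    private final long counterMask;

    // `epochBits` is the width of the epoch field; the rest of the 64 bits is the counter.
    ZxidLayout(int epochBits) {
        if (epochBits <= 0 || epochBits >= 64) {
            throw new IllegalArgumentException("epochBits must be in 1..63");
        }
        this.counterBits = 64 - epochBits;
        this.counterMask = (1L << counterBits) - 1;
    }

    long epochOf(long zxid) {
        return zxid >>> counterBits;   // unsigned shift keeps the epoch non-negative
    }

    long counterOf(long zxid) {
        return zxid & counterMask;
    }

    long make(long epoch, long counter) {
        return (epoch << counterBits) | (counter & counterMask);
    }

    public static void main(String[] args) {
        ZxidLayout layout = new ZxidLayout(24); // 24-bit epoch, 40-bit counter
        long zxid = layout.make(3, 123_456_789_012L);
        System.out.println(layout.epochOf(zxid) + " / " + layout.counterOf(zxid));
    }
}
```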




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-06-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036796#comment-16036796
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user maoling commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
Hi, @asdf2014. Thanks for your explanation! But I am still confused about 
question one. 
Look at code like this:
```
int epoch = (int)Long.rotateRight(zxid, 32);// >> 32;
long count = zxid & 0xffL;
```
It all depends on **zxid** not being altered (no write after **zxid** is 
first generated) in a multithreaded situation; otherwise epoch and count 
aren't idempotent. Should **zxid** be 
declared **final**?




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027107#comment-16027107
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user asdf2014 commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
Hi, @maoling. Thanks for the discussion. Maybe my description was unclear 
and confused you.

1. I was worried that if the lower 8 bits of the upper 32 bits are moved 
into the low part of the `long`, making it 40 bits, there might be 
a concurrency problem. Actually, there is no need to worry: all operations on 
`ZXID` are bit operations rather than `=` assignments. So it can't be 
a concurrency problem at the `JVM` level.

2. Yep, it is. In particular, at `1k/s` ops the `ZXID` will be exhausted 
after $2^{32} / (86400 * 1000) \approx 49.7$ days. Heavier load will make the 
`re-election` process come even earlier. At the same time, 
the `re-election` process could take half a minute, which is 
unacceptable.

3. So far, it throws an `XidRolloverException` to force the `re-election` 
process and reset the `counter` to zero.
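
The rollover behaviour in point 3 can be sketched as follows. This is an illustration under assumptions, not ZooKeeper's code: `RolloverException` is a hypothetical stand-in for `XidRolloverException`, and the 32/32 layout is assumed.

```java
final class ZxidRollover {
    static final long COUNTER_MASK = 0xffffffffL; // assumption: 32-bit counter

    // Hypothetical unchecked stand-in for ZooKeeper's XidRolloverException.
    static final class RolloverException extends RuntimeException {
        RolloverException(String message) {
            super(message);
        }
    }

    // Returns the next zxid in the same epoch, or throws when the counter is exhausted.
    static long nextZxid(long zxid) {
        if ((zxid & COUNTER_MASK) == COUNTER_MASK) {
            throw new RolloverException("counter exhausted, forcing re-election");
        }
        return zxid + 1;
    }

    // After re-election the epoch is incremented and the counter starts again at zero.
    static long firstZxidOfNextEpoch(long zxid) {
        return ((zxid >>> 32) + 1) << 32;
    }

    public static void main(String[] args) {
        long last = (2L << 32) | COUNTER_MASK;
        try {
            nextZxid(last);
        } catch (RolloverException e) {
            System.out.println(Long.toHexString(firstZxidOfNextEpoch(last)));
        }
    }
}
```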




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022741#comment-16022741
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user maoling commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
A good and interesting question!




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022547#comment-16022547
 ] 

Hadoop QA commented on ZOOKEEPER-2789:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/745//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/745//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/745//console

This message is automatically generated.



[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022510#comment-16022510
 ] 

Hadoop QA commented on ZOOKEEPER-2789:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/744//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/744//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/744//console

This message is automatically generated.



[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022448#comment-16022448
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user asdf2014 commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
Due to this [JVM 
bug](http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7177813), JDK 7 cannot 
resolve the `static import`... I will use the fully qualified name instead.
```bash
[javac] 
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/src/contrib/loggraph/src/java/org/apache/zookeeper/graph/JsonGenerator.java:129:
 error: cannot find symbol
[javac] long epoch = getEpochFromZxid(zxid);
```




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022407#comment-16022407
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user asdf2014 commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/262#discussion_r118176130
  
--- Diff: 
src/contrib/loggraph/src/java/org/apache/zookeeper/graph/JsonGenerator.java ---
@@ -121,8 +121,8 @@ public JsonGenerator(LogIterator iter) {
} else if ((m = newElectionP.matcher(e.getEntry())).find()) {
Iterator iterator = servers.iterator();
long zxid = Long.valueOf(m.group(2));
-   int count = (int)zxid;// & 0xFFFFFFFFL;
-   int epoch = (int)Long.rotateRight(zxid, 32);// >> 32;
+   long count = zxid & 0xffffffffffL;
+   int epoch = (int)Long.rotateRight(zxid, 40);// >> 40;
--- End diff --

All code that processes `ZXID` has already been unified to use `ZxidUtils`.
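
As a rough illustration of the reviewers' point, the shift width and mask can live in one place as named constants. This is only a sketch: `ZxidLayoutSketch` and its members are hypothetical names, not the PR's actual `ZxidUtils`, and the 40-bit counter width is taken from the diff above.

```java
// Hypothetical sketch of a single home for the epoch/counter layout.
final class ZxidLayoutSketch {
    // Assumed layout: high 24 bits = epoch, low 40 bits = counter.
    static final int COUNTER_BITS = 40;
    static final long COUNTER_MASK = (1L << COUNTER_BITS) - 1;

    private ZxidLayoutSketch() {}

    // Unsigned shift so the sign bit is never smeared into the epoch.
    static long epochOf(long zxid) {
        return zxid >>> COUNTER_BITS;
    }

    static long counterOf(long zxid) {
        return zxid & COUNTER_MASK;
    }

    static long makeZxid(long epoch, long counter) {
        return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK);
    }

    public static void main(String[] args) {
        long zxid = makeZxid(3, 42);
        System.out.println(epochOf(zxid) + " " + counterOf(zxid));
    }
}
```

Callers then never see the literal `40`, so changing the split later touches one file.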




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022356#comment-16022356
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user asdf2014 commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/262#discussion_r118168282
  
--- Diff: 
src/contrib/loggraph/src/java/org/apache/zookeeper/graph/JsonGenerator.java ---
@@ -121,8 +121,8 @@ public JsonGenerator(LogIterator iter) {
} else if ((m = newElectionP.matcher(e.getEntry())).find()) {
Iterator iterator = servers.iterator();
long zxid = Long.valueOf(m.group(2));
-   int count = (int)zxid;// & 0xFFFFFFFFL;
-   int epoch = (int)Long.rotateRight(zxid, 32);// >> 32;
+   long count = zxid & 0xffffffffffL;
--- End diff --

Yeah, you are right!




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022350#comment-16022350
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user nerdyyatrice commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/262#discussion_r118167311
  
--- Diff: 
src/contrib/loggraph/src/java/org/apache/zookeeper/graph/JsonGenerator.java ---
@@ -121,8 +121,8 @@ public JsonGenerator(LogIterator iter) {
} else if ((m = newElectionP.matcher(e.getEntry())).find()) {
Iterator iterator = servers.iterator();
long zxid = Long.valueOf(m.group(2));
-   int count = (int)zxid;// & 0xFFFFFFFFL;
-   int epoch = (int)Long.rotateRight(zxid, 32);// >> 32;
+   long count = zxid & 0xffffffffffL;
+   int epoch = (int)Long.rotateRight(zxid, 40);// >> 40;
--- End diff --

Same here: 40 shouldn't fly around in the code base like this.
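
One more thing that may be worth checking on this line, sketched under the assumption that the counter occupies the low 40 bits as in the diff above: `Long.rotateRight(zxid, 40)` followed by an `(int)` cast keeps 32 bits, so the low 8 bits of the counter land in bits 24..31 of the result. With the old 32/32 split the cast happened to discard exactly the rotated counter bits. `RotateVsShiftSketch` is a hypothetical name used only for this demonstration.

```java
// Hypothetical demo comparing the two ways of extracting a 24-bit epoch.
final class RotateVsShiftSketch {
    // Mirrors the expression in the diff: rotate, then truncate to int.
    static int epochByRotate(long zxid) {
        return (int) Long.rotateRight(zxid, 40);
    }

    // Unsigned shift drops the counter bits instead of rotating them around.
    static long epochByShift(long zxid) {
        return zxid >>> 40;
    }

    public static void main(String[] args) {
        long zxid = (1L << 40) | 1L; // epoch 1, counter 1
        System.out.println(epochByRotate(zxid)); // counter bit 0 shows up at bit 24
        System.out.println(epochByShift(zxid));
    }
}
```

The two agree whenever the low 8 bits of the counter are zero, so a test with a small non-zero counter would be needed to catch the difference.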




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022348#comment-16022348
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user nerdyyatrice commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/262#discussion_r118167130
  
--- Diff: 
src/contrib/loggraph/src/java/org/apache/zookeeper/graph/JsonGenerator.java ---
@@ -121,8 +121,8 @@ public JsonGenerator(LogIterator iter) {
} else if ((m = newElectionP.matcher(e.getEntry())).find()) {
Iterator iterator = servers.iterator();
long zxid = Long.valueOf(m.group(2));
-   int count = (int)zxid;// & 0xFFFFFFFFL;
-   int epoch = (int)Long.rotateRight(zxid, 32);// >> 32;
+   long count = zxid & 0xffffffffffL;
--- End diff --

How can this be all over the code base instead of a function somewhere in a 
util file?




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020942#comment-16020942
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user asdf2014 commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
It seems all test cases 
[passed](https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/738/testReport/),
 but a problem occurred in `Zookeeper_operations` :: 
`testOperationsAndDisconnectConcurrently1`:
```bash
 [exec] BUILD FAILED
 [exec] 
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml:1298:
 The following error occurred while executing this line:
 [exec] 
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml:1308:
 exec returned: 2
 [exec] 
 [exec] Total time: 15 minutes 45 seconds
 [exec] /bin/kill -9 16911 
 [exec]  [exec] Zookeeper_operations::testAsyncWatcher1 : assertion 
: elapsed 1044
 [exec]  [exec] Zookeeper_operations::testAsyncGetOperation : 
elapsed 4 : OK
 [exec]  [exec] 
Zookeeper_operations::testOperationsAndDisconnectConcurrently1FAIL: zktest-mt
 [exec]  [exec] ==
 [exec]  [exec] 1 of 2 tests failed
 [exec]  [exec] Please report to u...@zookeeper.apache.org
 [exec]  [exec] ==
 [exec]  [exec] make[1]: Leaving directory 
`/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build/test/test-cppunit'
 [exec]  [exec] /bin/bash: line 5: 15116 Segmentation fault  
ZKROOT=/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/src/c/../..
 CLASSPATH=$CLASSPATH:$CLOVER_HOME/lib/clover.jar ${dir}$tst
 [exec]  [exec] make[1]: *** [check-TESTS] Error 1
 [exec]  [exec] make: *** [check-am] Error 2
 [exec] 
 [exec] Running contrib tests.
 [exec] 
==
 [exec] 
 [exec] /home/jenkins/tools/ant/apache-ant-1.9.9/bin/ant 
-DZookeeperPatchProcess= -Dtest.junit.output.format=xml -Dtest.output=yes 
test-contrib
 [exec] Buildfile: 
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml
 [exec] 
 [exec] test-contrib:
 [exec] 
 [exec] BUILD SUCCESSFUL
 [exec] Total time: 0 seconds
```




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020828#comment-16020828
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user asdf2014 commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
Why did Jenkins report the following message?
```bash
mv: 
'/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess'
 and 
'/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess'
 are the same file
```




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020672#comment-16020672
 ] 

Hadoop QA commented on ZOOKEEPER-2789:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/738//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/738//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/738//console

This message is automatically generated.



[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020647#comment-16020647
 ] 

Hadoop QA commented on ZOOKEEPER-2789:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/737//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/737//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/737//console

This message is automatically generated.



[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020640#comment-16020640
 ] 

Hadoop QA commented on ZOOKEEPER-2789:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/736//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/736//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/736//console

This message is automatically generated.



[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020636#comment-16020636
 ] 

Hadoop QA commented on ZOOKEEPER-2789:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/735//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/735//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/735//console

This message is automatically generated.



[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020603#comment-16020603
 ] 

Hadoop QA commented on ZOOKEEPER-2789:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/734//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/734//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/734//console

This message is automatically generated.



[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020594#comment-16020594
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

Github user asdf2014 commented on the issue:

https://github.com/apache/zookeeper/pull/262
  
Considering some abnormal situations, maybe 24 bits for `epoch` and 40 bits 
for `counter` is a better choice. With one leader election per hour and `1k/s` 
ops: $Math.min(2^{24} / (24 * 365), 2^{40} / (86400 * 1000 * 365)) \approx 
Math.min(1915.2, 34.9) = 34.9$ years.
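
The arithmetic above can be reproduced with a small sketch. `ZxidLifetime` and its method names are hypothetical and not part of ZooKeeper; the election rate (one per hour) and the load (`1k/s` ops) are the assumptions implied by the formula.

```java
/** Hypothetical helper, not part of ZooKeeper: estimates how long a ZXID layout lasts. */
final class ZxidLifetime {
    private static final double SECONDS_PER_YEAR = 86400.0 * 365;

    /** Years until the epoch field is exhausted, given leader elections per year. */
    static double epochYears(int epochBits, double electionsPerYear) {
        return Math.pow(2, epochBits) / electionsPerYear;
    }

    /** Years until the counter field is exhausted, given a steady ops-per-second rate. */
    static double counterYears(int epochBits, double opsPerSecond) {
        int counterBits = 64 - epochBits; // the remaining bits of the 64-bit long
        return Math.pow(2, counterBits) / (opsPerSecond * SECONDS_PER_YEAR);
    }

    public static void main(String[] args) {
        // Assumed workload: 24-bit epoch, one election per hour, 1k ops/s.
        double epoch = epochYears(24, 24 * 365);
        double counter = counterYears(24, 1000);
        System.out.printf("epoch: %.1f years, counter: %.1f years, limit: %.1f years%n",
                epoch, counter, Math.min(epoch, counter));
    }
}
```

With these inputs the 40-bit counter is the limiting term, which is why the minimum drops from the 16/48 figure even though the epoch field lasts much longer.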




[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020569#comment-16020569
 ] 

Hadoop QA commented on ZOOKEEPER-2789:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/733//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/733//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/733//console

This message is automatically generated.



[jira] [Commented] (ZOOKEEPER-2789) Reassign `ZXID` for solving 32bit overflow problem

2017-05-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020563#comment-16020563
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2789:
---

GitHub user asdf2014 opened a pull request:

https://github.com/apache/zookeeper/pull/262

ZOOKEEPER-2789: Reassign `ZXID` for solving 32bit overflow problem

At `1k/s` ops, the 32-bit `counter` of the `ZXID` will be exhausted after 
$2^{32} / (86400 * 1000) \approx 49.7$ days. But if we reassign the `ZXID` to 
16 bits for `epoch` and 48 bits for `counter`, the problem will not occur 
until after $Math.min(2^{16} / 365, 2^{48} / (86400 * 1000 * 365)) \approx 
Math.min(179.6, 8925.5) = 179.6$ years (the first term assumes one leader 
election per day).
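
A minimal sketch of what a 16-bit `epoch` / 48-bit `counter` layout could look like follows. The class `Zxid48` and its method names are hypothetical, chosen only for illustration; this is not the code in the pull request.

```java
/** Hypothetical sketch of a 16-bit epoch / 48-bit counter ZXID layout; not ZooKeeper's code. */
final class Zxid48 {
    static final int COUNTER_BITS = 48;
    static final long COUNTER_MASK = (1L << COUNTER_BITS) - 1;

    /** Packs the epoch into the high 16 bits and the counter into the low 48 bits. */
    static long makeZxid(long epoch, long counter) {
        return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK);
    }

    /** Extracts the epoch; the unsigned shift keeps the result non-negative. */
    static long epochOf(long zxid) {
        return zxid >>> COUNTER_BITS;
    }

    /** Extracts the counter by masking off the epoch bits. */
    static long counterOf(long zxid) {
        return zxid & COUNTER_MASK;
    }
}
```

One thing to keep in mind with any such layout: once the top bit of the epoch is set, the packed `long` becomes negative, so code that orders ZXIDs with a signed comparison would need to take that into account.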

However, as I understand it, the ZXID is a `long`, and in the JVM a read or 
write of a `long` (likewise a `double`) may be split into separate operations 
on the high 32 bits and the low 32 bits. Because the `ZXID` variable is not 
declared `volatile` and is not boxed in the corresponding reference type 
(`Long` / `Double`), such an access is a [non-atomic operation] 
(https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.7). 
Thus, if the `epoch` / `counter` boundary no longer coincides with the 
boundary between the high and low 32 bits of the `long`, there may be a 
concurrency problem.
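
One way to rule out torn reads, sketched here under the assumption that the ZXID lives in a single shared field, is to hold it in an `AtomicLong` (or declare it `volatile`); JLS 17.7 states that reads and writes of `volatile` `long` values are always atomic. `ZxidHolder` is a hypothetical name and does not describe how ZooKeeper actually stores the value.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical holder, not ZooKeeper's code: keeps the ZXID in an AtomicLong. */
final class ZxidHolder {
    // AtomicLong reads and writes have volatile semantics, so all 64 bits
    // are observed together regardless of where the epoch/counter split is.
    private final AtomicLong zxid = new AtomicLong();

    long get() {
        return zxid.get();
    }

    void set(long value) {
        zxid.set(value);
    }

    /** Atomically advances the ZXID by one and returns the new value. */
    long next() {
        return zxid.incrementAndGet();
    }
}
```

Compared with a plain `volatile long`, `AtomicLong` also makes the increment itself atomic, which matters if more than one thread can advance the counter.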

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/asdf2014/zookeeper reassign_zxid

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/262.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #262


commit 0d7dabb6b6a42f430223c51a1738e03d4d340c79
Author: asdf2014 <1571805...@qq.com>
Date:   2017-05-23T02:02:14Z

ZOOKEEPER-2789: Reassign `ZXID` for solving 32bit overflow problem



