[GitHub] [accumulo] jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers

2019-10-02 Thread GitBox
jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers
URL: https://github.com/apache/accumulo/issues/1225#issuecomment-537537892
 
 
   
   @belugabehr, I like Curator too. I have been running the TreeCacheExample 
alongside my Accumulo instance to check it against my debug tracing of what 
goes on in Accumulo's ZooKeeper cache. But for now we have to stop shooting 
ourselves in the foot by blowing away table configurations in the ZooCache 
unnecessarily.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [accumulo] jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers

2019-10-02 Thread GitBox
jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers
URL: https://github.com/apache/accumulo/issues/1225#issuecomment-537533690
 
 
   I have implemented what @ctubbsii recommended at the top of the ticket but 
it is not perfected yet (even though it seems to run well) so I will not make a 
pull request at this time.  
   
   What I learned along the way from trace debugging the ZooCache.get(zPath) 
function, the process function in ZCacheWatcher, and other parts of the code 
pretty much convinces me that @ctubbsii's idea is a sound solution. The main 
reason is that it prevents the removal from the ZooCache of the table 
configurations for all tables during a "createtable", "deletetable", or 
"clonetable" operation. During those operations the TServers wipe out every 
cached znode whose path starts with "/accumulo/{INSTANCE_ID}/tables" (this 
occurs for important reasons that I will try to understand more thoroughly). 
Then, when ZooCache.get(path) is called for a configuration path that should 
not have been evicted, the value is no longer in the ZooCache and has to be 
refreshed with new calls to ZooKeeper's exists and then getData. If you have 
many tables across many TServers and a lot of table creations and deletions, 
this will burden ZooKeeper.
   
   I moved the table configurations out of the aforementioned ZooKeeper path to 
/accumulo/{INSTANCE_ID}/table_configs/table/{TABLE_ID}/conf to prevent the 
erasure and subsequent re-fetching of table configurations from ZooKeeper.
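
   The effect of prefix-based eviction, and of moving the configuration 
entries to a sibling path, can be illustrated with a toy model. This is only a 
sketch under stated assumptions: `PrefixEvictionSketch` is a hypothetical 
in-memory stand-in, not Accumulo's ZooCache, the "server read" is simulated by 
a counter, and `INSTANCE_ID` and the property name are placeholders.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of a path-keyed cache; hypothetical, not Accumulo's ZooCache. */
class PrefixEvictionSketch {
  private final Map<String, String> cache = new ConcurrentHashMap<>();
  private int serverReads = 0; // counts simulated round trips to ZooKeeper

  /** Returns the cached value, or simulates a fetch and counts one server read. */
  String get(String path) {
    return cache.computeIfAbsent(path, p -> {
      serverReads++;
      return "value-of-" + p;
    });
  }

  /** Models the eviction described above: drop every entry under the prefix. */
  void evictPrefix(String prefix) {
    cache.keySet().removeIf(k -> k.startsWith(prefix));
  }

  int serverReads() {
    return serverReads;
  }

  public static void main(String[] args) {
    String root = "/accumulo/INSTANCE_ID"; // placeholder instance id
    String oldConf = root + "/tables/1/conf/table.split.threshold";
    String newConf = root + "/table_configs/table/1/conf/table.split.threshold";

    PrefixEvictionSketch zc = new PrefixEvictionSketch();
    zc.get(oldConf); // first read of the old-style path
    zc.get(newConf); // first read of the relocated path
    zc.evictPrefix(root + "/tables"); // e.g. triggered by a createtable
    zc.get(oldConf); // miss: has to be fetched again
    zc.get(newConf); // hit: not under the evicted prefix
    System.out.println("server reads: " + zc.serverReads()); // prints "server reads: 3"
  }
}
```

   In this model the relocated path survives the eviction because 
"/table_configs" does not start with "/tables", so only the old-style path 
costs an extra read.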
   
   I can see in the trace debug that calls to get the table configurations are 
using the new znode path and that they are consistently retrieved from the 
ZooCache without contacting ZooKeeper again. This is a good thing. The present 
code pulls from the cache too, but the entries have to be refreshed again 
whenever the "/accumulo/{INSTANCE_ID}/tables" path has been wiped out in the 
"NodeDeleted" case in ZCacheWatcher.process.
   Even though my code still places a Watcher on each table configuration 
item, it's usually a one-time event. I don't think that putting a watcher on 
the znode path "/accumulo/{INSTANCE_ID}/table_configs/" is required. We would 
have to implement a NodeChildrenChanged case in the ZCacheWatcher.process 
function (I have done this in one of my branches), which is burdensome on 
ZooKeeper. Maybe just calling ZooCache.get(event.getPath) in the 
NodeDataChanged case of process would be more efficient.
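
   The idea of re-reading a single path in the NodeDataChanged case, rather 
than evicting a whole subtree, could look roughly like the sketch below. It is 
an illustration only: `RefreshOnChangeSketch`, its `EventType` enum, and the 
`reader` function are hypothetical stand-ins, and no real ZooKeeper client is 
involved. With a real client, standard watches are one-time triggers, so the 
re-read would also be the point at which the watch is set again.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache that refreshes one entry when its data changes. */
class RefreshOnChangeSketch {
  /** Stand-in for watcher event types; only the cases discussed here. */
  enum EventType { NODE_DATA_CHANGED, NODE_DELETED }

  private final Map<String, String> cache = new ConcurrentHashMap<>();
  private final Function<String, String> reader; // assumption: reads one znode's data

  RefreshOnChangeSketch(Function<String, String> reader) {
    this.reader = reader;
  }

  /** Serves from the cache, reading from the backing store only on a miss. */
  String get(String path) {
    return cache.computeIfAbsent(path, reader);
  }

  /** Handles a watcher event for exactly one path. */
  void process(EventType type, String path) {
    switch (type) {
      case NODE_DATA_CHANGED:
        String fresh = reader.apply(path); // re-read only the changed path
        if (fresh == null) {
          cache.remove(path);
        } else {
          cache.put(path, fresh);
        }
        break;
      case NODE_DELETED:
        cache.remove(path); // drop only the deleted path, not its siblings
        break;
    }
  }

  public static void main(String[] args) {
    Map<String, String> server = new ConcurrentHashMap<>(); // simulated ZooKeeper data
    String path = "/accumulo/INSTANCE_ID/table_configs/table/1/conf/table.split.threshold";
    server.put(path, "1G");

    RefreshOnChangeSketch zc = new RefreshOnChangeSketch(server::get);
    System.out.println(zc.get(path)); // prints "1G"

    server.put(path, "2G"); // the value changes on the server
    zc.process(EventType.NODE_DATA_CHANGED, path);
    System.out.println(zc.get(path)); // prints "2G"
  }
}
```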
   
   I have re-set these new znode paths in the ZooKeeper CLI and seen them 
update in the ZooCache just fine with trace debugging. In addition, I have run 
accumulo-testing's createtable and ingest, and everything works fine. Cloning 
the ingested table works too. I will ask @ivakegg to take a look at my 
solution sometime this week if he has time.
   
   
   
   




[GitHub] [accumulo] jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers

2019-09-17 Thread GitBox
jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers
URL: https://github.com/apache/accumulo/issues/1225#issuecomment-532276009
 
 
   Due to the high risk of unintended effects, I will fall back to implementing 
the idea mentioned at the top of the ticket:
   
   _One possible solution is instead of having a watcher for every 
configuration item, we have only a single configuration version field in 
ZooKeeper for all configuration items, and a single watcher (per process) to 
track that version field and reload all configurations whenever it is changed. 
This would require us to ensure that we increment this field whenever a 
configuration is changed._
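
   The quoted idea can be sketched without ZooKeeper at all. The example below 
is a simplified model under stated assumptions: `Store` stands in for 
ZooKeeper holding the configuration entries plus one version counter, `Client` 
stands in for a process with a single watcher on that counter, and all class 
and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of "one version field, one watcher per process". */
class VersionedConfigSketch {
  /** Stand-in for ZooKeeper: config entries plus one shared version counter. */
  static class Store {
    final Map<String, String> entries = new HashMap<>();
    final AtomicLong version = new AtomicLong();

    void set(String key, String value) {
      entries.put(key, value);
      version.incrementAndGet(); // every configuration write must bump the version
    }
  }

  /** Per-process view that reloads everything when the version moves. */
  static class Client {
    private final Store store;
    private Map<String, String> snapshot = new HashMap<>();
    private long seenVersion = -1;
    int reloads = 0;

    Client(Store store) {
      this.store = store;
    }

    /** Would be invoked by the single watcher on the version field. */
    void onVersionChanged() {
      long current = store.version.get();
      if (current != seenVersion) {
        snapshot = new HashMap<>(store.entries); // reload all configurations
        seenVersion = current;
        reloads++;
      }
    }

    String get(String key) {
      return snapshot.get(key);
    }
  }

  public static void main(String[] args) {
    Store store = new Store();
    Client client = new Client(store);

    store.set("table.split.threshold", "1G");
    client.onVersionChanged();
    store.set("table.split.threshold", "2G");
    client.onVersionChanged();
    client.onVersionChanged(); // version unchanged, so no reload

    // prints "2G after 2 reloads"
    System.out.println(client.get("table.split.threshold") + " after " + client.reloads + " reloads");
  }
}
```

   The trade-off this model makes visible is that any write path that forgets 
to bump the version leaves every client with a stale snapshot.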




[GitHub] [accumulo] jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers

2019-09-16 Thread GitBox
jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers
URL: https://github.com/apache/accumulo/issues/1225#issuecomment-531837738
 
 
   More TreeCache listener output showing how the cache is updated during 
continuous ingest. 
   
   
[tree_cache.txt](https://github.com/apache/accumulo/files/3617057/tree_cache.txt)
   




[GitHub] [accumulo] jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers

2019-09-13 Thread GitBox
jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers
URL: https://github.com/apache/accumulo/issues/1225#issuecomment-531360046
 
 
   @ctubbsii, I investigated how I would implement: "_One possible solution is 
instead of having a watcher for every configuration item, we have only a single 
configuration version field in ZooKeeper for all configuration items, and a 
single watcher (per process) to track that version field and reload all 
configurations whenever it is changed. This would require us to ensure that we 
increment this field whenever a configuration is changed._"
   
   I saw a lot of complexity inside the ZooCache and ZooReader objects that 
makes that approach seem difficult to implement in a way that we could be sure 
would work as intended all of the time. I looked at using some of the other 
members of the ZooKeeper Stat object, like version and cversion, to help us 
keep track of version state, but this could lead to some logical errors down 
the road.
   
   Ultimately, I looked at the Apache Curator project. I ran the 
TreeCacheExample, which runs continuously against a ZooKeeper instance. I 
started up an instance of Fluo Uno and then started 'cingest ingest' in 
Accumulo-Testing. In that TreeCacheExample, a listener function is added to 
the CuratorFramework client object and another listener function is added to 
the TreeCache object. These listeners captured all of the actions on all 
ZooKeeper paths as they occurred during the ingest.
   
   The TreeCache object in Curator really seems to do what we want in terms 
of reducing watchers and getting reliable access to data, without growing maps 
so large that they cause problems.
   
   I would like to put a TreeCache object inside of ZooCache and remove the 
maps and locks that may be causing issues.  The TreeCache object internally 
does retries and handles all of the concurrency issues and data storage issues 
that ZooCache does.  Possibly it does it better.  What do you think about this 
approach?
   
   Here is the TreeCacheExample code:
   ```java
   /**
    * Licensed to the Apache Software Foundation (ASF) under one
    * or more contributor license agreements.  See the NOTICE file
    * distributed with this work for additional information
    * regarding copyright ownership.  The ASF licenses this file
    * to you under the Apache License, Version 2.0 (the
    * "License"); you may not use this file except in compliance
    * with the License.  You may obtain a copy of the License at
    *
    *   http://www.apache.org/licenses/LICENSE-2.0
    *
    * Unless required by applicable law or agreed to in writing,
    * software distributed under the License is distributed on an
    * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    * KIND, either express or implied.  See the License for the
    * specific language governing permissions and limitations
    * under the License.
    */
   package cache;

   import framework.CreateClientExamples;
   import org.apache.curator.framework.CuratorFramework;
   import org.apache.curator.framework.recipes.cache.TreeCache;
   import java.io.BufferedReader;
   import java.io.InputStreamReader;

   public class TreeCacheExample
   {
       public static void main(String[] args) throws Exception
       {
           // CreateClientExamples is a helper class from the Curator examples module
           CuratorFramework client = CreateClientExamples.createSimple("127.0.0.1:2181");

           // Listener on the client: report unhandled errors
           client.getUnhandledErrorListenable().addListener((message, e) -> {
               System.err.println("error=" + message);
               e.printStackTrace();
           });
           // Listener on the client: report connection state changes
           client.getConnectionStateListenable().addListener((c, newState) -> {
               System.out.println("state=" + newState);
           });
           client.start();

           // Cache the whole tree from "/" down, tracking paths only (no node data)
           TreeCache cache = TreeCache.newBuilder(client, "/").setCacheData(false).build();
           // Listener on the cache: print every event, with its path when available
           cache.getListenable().addListener((c, event) -> {
               if ( event.getData() != null )
               {
                   System.out.println("type=" + event.getType() + " path=" + event.getData().getPath());
               }
               else
               {
                   System.out.println("type=" + event.getType());
               }
           });
           cache.start();

           // Keep running until a line is read from standard input
           BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
           in.readLine();
       }
   }
   ```
   
   Here is some sample output while running ingest in Uno:
   
   state=CONNECTED
   type=NODE_ADDED path=/
   type=NODE_ADDED path=/accumulo
   type=NODE_ADDED path=/tracers
   type=NODE_ADDED path=/zookeeper
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c
   type=NODE_ADDED path=/accumulo/instances
   type=NODE_ADDED path=/tracers/trace-00
   type=NODE_ADDED path=/zookeeper/quota
   type=NODE_ADDED path=/accumulo/01414446-ec20-4766-a586-9ae08ed91c0c/bulk_failed_copyq
   type=NODE_ADDED 

[GitHub] [accumulo] jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers

2019-08-09 Thread GitBox
jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers
URL: https://github.com/apache/accumulo/issues/1225#issuecomment-519900033
 
 
   @ctubbsii, ZooKeeper 3.5.5 seems to depend on this project, which is not an 
Apache Software Foundation project. At a quick glance the source code does not 
seem harmful, but it is of Chinese origin.
   
   https://github.com/wangfucai/zookeeper-jute




[GitHub] [accumulo] jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers

2019-08-09 Thread GitBox
jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers
URL: https://github.com/apache/accumulo/issues/1225#issuecomment-519895280
 
 
   @nkalmar,  I get this warning when I try to build Accumulo with zookeeper 
3.5.5.
   
   [WARNING] Used undeclared dependencies found:
   [WARNING]    org.apache.zookeeper:zookeeper-jute:jar:3.5.5:compile
   
   It ultimately causes a failure in the build of Accumulo Core.




[GitHub] [accumulo] jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers

2019-07-24 Thread GitBox
jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers
URL: https://github.com/apache/accumulo/issues/1225#issuecomment-514749467
 
 
   In the accumulo pom.xml I set the zookeeper.version to 3.5.0-SNAPSHOT and 
Accumulo did build.

