Hi Madhan,

In our meeting earlier this week, you suggested that we run with an external HBase rather than the embedded HBase, which you have found has delays around some transaction commits. I am not very familiar with HBase and wondered if you could point me in the right direction. This is what I did:
1) I ran an embedded HBase build and found the Solr and HBase tar.gz files.
2) I then expanded these archives in a new runtime folder.
3) I built Atlas without the embedded HBase option and then copied the Atlas tree into the runtime folder.
4) I found https://atlas.apache.org/InstallationSteps.html .

- It indicates I should specify

    export ATLAS_SERVER_OPTS="-server -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+PrintTenuringDistribution -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dumps/atlas_server.hprof -Xloggc:logs/gc-worker.log -verbose:gc -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1m -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCTimeStamps"

  which I put in atlas-env.sh.

- As I am on a Mac, it suggests I specify

    export ATLAS_SERVER_OPTS="-Djava.awt.headless=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="

  which I did.

- It says I should change the config to:

    atlas.graph.storage.hbase.table=atlas
    atlas.audit.hbase.tablename=apache_atlas_entity_audit

  Is this correct? The application.properties that was generated has "atlas.graph.storage.hbase.table=apache_atlas_titan" instead.

- It says I should start Solr with

    $SOLR_HOME/bin/solr start -c -z <zookeeper_host:port> -p 8983

  I do not know what to put in for <zookeeper_host:port>. Do I need to specify this if I am using the Solr embedded ZooKeeper?

- It then says I should run

    $SOLR_BIN/solr create -c vertex_index -d SOLR_CONF -shards #numShards -replicationFactor #replicationFactor
    $SOLR_BIN/solr create -c edge_index -d SOLR_CONF -shards #numShards -replicationFactor #replicationFactor
    $SOLR_BIN/solr create -c fulltext_index -d SOLR_CONF -shards #numShards -replicationFactor #replicationFactor

  It seems I need to specify numbers for numShards and replicationFactor. Can I let these default? If not, what should I specify here?
- It then says I need to specify

    atlas.graph.index.search.backend=solr5
    atlas.graph.index.search.solr.mode=cloud
    atlas.graph.index.search.solr.zookeeper-url=<the ZK quorum setup for solr as comma separated value> eg: 10.1.6.4:2181,10.1.6.5:2181
    atlas.graph.index.search.solr.zookeeper-connect-timeout=<SolrCloud Zookeeper Connection Timeout>. Default value is 60000 ms
    atlas.graph.index.search.solr.zookeeper-session-timeout=<SolrCloud Zookeeper Session Timeout>. Default value is 60000 ms

  I am not sure what to put for atlas.graph.index.search.solr.zookeeper-url. Should we be using the Solr embedded ZK? If so, do I need this line?

I updated HBASE_CONF_DIR to point to the conf folder of the HBase I had expanded. I start Solr using

    $SOLR_HOME/bin/solr start -c -p 8983

then I start HBase, and then I start Atlas.

Atlas says it has started successfully, but the last line in the application.log says:

    2018-02-15 17:25:23,755 INFO - [main:] ~ Not running setup per configuration atlas.server.run.setup.on.start. (SetupSteps$SetupRequired:189)

The installation twiki says: "If the setup failed due to HBase JanusGraph schema setup errors, it may be necessary to repair the HBase schema. If no data has been stored, one can also disable and drop the HBase tables used by Atlas and run setup again." It does not indicate which commands I need to run or how to run them.

Many thanks,
David.
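
P.S. In case it makes the sequence easier to follow, here is a rough sketch of the start-up order described above as a script. It is an illustration rather than my exact commands: the RUNTIME_DIR layout and the solr, hbase and atlas subfolder names are hypothetical, SOLR_CONF pointing at the Atlas conf/solr folder is an assumption, the HBase and Atlas start scripts are the ones I would expect to find under each bin folder, and the value 1 for shards and replication factor is only a guess for a single-node setup (which is one of the things I am asking about). By default it only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Sketch only: paths below are assumed for illustration, not my real layout.
RUNTIME_DIR="${RUNTIME_DIR:-$HOME/atlas-runtime}"   # hypothetical runtime folder
SOLR_HOME="${SOLR_HOME:-$RUNTIME_DIR/solr}"         # expanded Solr archive (assumed name)
HBASE_HOME="${HBASE_HOME:-$RUNTIME_DIR/hbase}"      # expanded HBase archive (assumed name)
ATLAS_HOME="${ATLAS_HOME:-$RUNTIME_DIR/atlas}"      # copied Atlas tree (assumed name)
SOLR_CONF="${SOLR_CONF:-$ATLAS_HOME/conf/solr}"     # assumed location of the Solr config set

# Guessed values for a single-node setup; the install page leaves these open.
NUM_SHARDS="${NUM_SHARDS:-1}"
REPLICATION_FACTOR="${REPLICATION_FACTOR:-1}"

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Point HBase clients at the conf folder of the expanded HBase.
export HBASE_CONF_DIR="$HBASE_HOME/conf"

# 1. Start Solr in cloud mode without -z, so it uses its embedded ZooKeeper.
run "$SOLR_HOME/bin/solr" start -c -p 8983

# 2. Create the three collections named on the installation page.
for index in vertex_index edge_index fulltext_index; do
    run "$SOLR_HOME/bin/solr" create -c "$index" -d "$SOLR_CONF" \
        -shards "$NUM_SHARDS" -replicationFactor "$REPLICATION_FACTOR"
done

# 3. Start HBase, then Atlas (script names assumed from the standard distributions).
run "$HBASE_HOME/bin/start-hbase.sh"
run "$ATLAS_HOME/bin/atlas_start.py"
```

Running it as is should just list the commands in order, which is enough to check the sequence against the installation page before executing anything.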