Hi Pierre

Attached is the patch with the change you suggested below.

I have started a pre-commit build. If it succeeds, I will commit it.

Best regards,

~ ashutosh
Ashutosh Mestry <ames...@hortonworks.com> . Staff Software Engineer .
Hortonworks, Inc. .  +1-310-988 0670
.......
No hurry, no pause. – Tim Ferriss, Life Hacker, Author

From: Pierre Padovani <pie...@padovani.org>
Date: Monday, April 9, 2018 at 12:48 PM
To: Ashutosh Mestry <ames...@hortonworks.com>, 
"nixon.rodrig...@freestoneinfotech.com" <nixon.rodrig...@freestoneinfotech.com>
Cc: "dev@atlas.apache.org" <dev@atlas.apache.org>
Subject: Re: Atlas Startup Failure with HBase Backend

If one of you wants to update the in-flight patch, here is the code that will 
retry connecting to Cassandra for up to 9 seconds.

CassandraAuditRepositoryTest.java -

Add these constants to the top of the file:

  private static final String TEST_CLUSTER_NAME = "Test Cluster";
  private static final int CLUSTER_PORT = 9042;
  private static final String CLUSTER_HOST = "localhost";

  private static final int MAX_RETRIES = 9;


Replace the Thread.sleep with this code:

    // Retry the connection until we either connect or time out
    Cluster cluster = Cluster.builder()
        .addContactPoint(CLUSTER_HOST)
        .withClusterName(TEST_CLUSTER_NAME)
        .withPort(CLUSTER_PORT)
        .build();
    try {
      for (int retryCount = 0; retryCount < MAX_RETRIES; retryCount++) {
        // try-with-resources closes the session on every path
        try (Session cassSession = cluster.connect()) {
          if (!cassSession.getState().getConnectedHosts().isEmpty()) {
            return;
          }
        } catch (Exception e) {
          // Cassandra is not accepting connections yet; wait and retry
        }
        // Wait one second after every failed attempt, not only after an exception
        Thread.sleep(1000);
      }
    } finally {
      // Release the driver resources held by this probe cluster
      cluster.close();
    }
    throw new RuntimeException("Unable to connect to embedded Cassandra after " + MAX_RETRIES + " seconds.");
  }

I can generate a patch with this as well... let me know what you want to do.

Pierre

On Mon, Apr 9, 2018 at 1:59 PM, Pierre Padovani <pie...@padovani.org> wrote:
I believe this may be environmental... this test works locally for me:

Running org.apache.atlas.repository.audit.InMemoryAuditRepositoryTest
Running org.apache.atlas.repository.audit.CassandraAuditRepositoryTest
Running org.apache.atlas.repository.userprofile.UserProfileServiceTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.346 sec - in org.apache.atlas.repository.impexp.TypeAttributeDifferenceTest
Running org.apache.atlas.repository.migration.RelationshipMappingTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.689 sec - in org.apache.atlas.repository.audit.InMemoryAuditRepositoryTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.748 sec - in org.apache.atlas.repository.impexp.ImportTransformsTest
Running org.apache.atlas.repository.migration.HiveParititionTest
Running org.apache.atlas.repository.migration.HiveStocksTest
Running org.apache.atlas.repository.store.graph.v1.AtlasEntityStoreV1Test
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 28.732 sec - in org.apache.atlas.repository.audit.CassandraAuditRepositoryTest

The class 'CassandraAuditRepositoryTest' has a hard-coded sleep in place to
allow the embedded Cassandra to start up. It is very likely that the box this is
being run on is slow enough that Cassandra is not yet ready when the sleep
ends, which causes the tests to fail. The quickest way to unblock this would be
to increase the sleep from 1 second to a greater value, say 5 seconds. I would
recommend we do this to unblock the integration tests in the short term.

Longer term, I'll change the test case so that the setup method tries to ping
the Cassandra server a few times and exits with an error if Cassandra has not
started up within a set period of time or after a certain number of tries.
I created https://issues.apache.org/jira/browse/ATLAS-2547 to track this issue.
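
As a rough illustration of the "ping a few times, then give up" idea described
above, here is a minimal sketch. It is not the Atlas code and not the fix
tracked in the JIRA: the class name PortWaiter and the method waitForPort are
hypothetical, and it probes the port with a plain TCP socket rather than the
Cassandra driver. An open port only shows that something is listening, so a
driver-level connection check would still be the stronger test.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper (not part of Atlas): waits for a TCP port to accept connections. */
final class PortWaiter {

    private PortWaiter() {
    }

    /**
     * Tries to open a TCP connection up to maxAttempts times, pausing
     * pauseMillis between failed attempts.
     *
     * @return true as soon as one attempt connects, false if every attempt fails
     */
    static boolean waitForPort(String host, int port, int maxAttempts, long pauseMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try (Socket socket = new Socket()) {
                // Bound each attempt (1 second) so a silent host cannot stall the loop.
                socket.connect(new InetSocketAddress(host, port), 1000);
                return true;
            } catch (IOException e) {
                // Port is not open yet; fall through and wait before the next attempt.
            }
            if (attempt < maxAttempts) {
                Thread.sleep(pauseMillis);
            }
        }
        return false;
    }
}
```

In a test setup method this could be called as
PortWaiter.waitForPort("localhost", 9042, 9, 1000), failing the setup with a
clear error when it returns false. Bounding the number of attempts keeps a
broken environment from hanging the build.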

Pierre



On Mon, Apr 9, 2018 at 1:18 PM, Nixon Rodrigues <nixon.rodrig...@freestoneinfotech.com> wrote:
Ashutosh, Pierre,

Can you review the unit test case below? It is failing in the pre-commit Jenkins job.

Tests run: 7, Failures: 1, Errors: 0, Skipped: 6, Time elapsed: 17.872 sec <<< FAILURE! - in org.apache.atlas.repository.audit.CassandraAuditRepositoryTest
setup(org.apache.atlas.repository.audit.CassandraAuditRepositoryTest)  Time elapsed: 17.709 sec  <<< FAILURE!
org.apache.atlas.AtlasException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1:9042 (com.datastax.driver.core.exceptions.TransportException: [localhost/127.0.0.1:9042] Cannot connect))
        at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:233)
        at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:79)
        at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1483)
        at com.datastax.driver.core.Cluster.init(Cluster.java:159)
        at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:330)
        at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:305)
        at com.datastax.driver.core.Cluster.connect(Cluster.java:247)
        at org.apache.atlas.repository.audit.CassandraBasedAuditRepository.createSession(CassandraBasedAuditRepository.java:217)
        at org.apache.atlas.repository.audit.CassandraBasedAuditRepository.startInternal(CassandraBasedAuditRepository.java:208)
        at org.apache.atlas.repository.audit.CassandraBasedAuditRepository.start(CassandraBasedAuditRepository.java:196)
        at org.apache.atlas.repository.audit.CassandraAuditRepositoryTest.setup(CassandraAuditRepositoryTest.java:48)

On Mon, Apr 9, 2018, 11:40 PM Pierre Padovani <pie...@padovani.org> wrote:

> Hi Ashutosh,
>
> Good catch! This looks good to me.
>
> Thanks!
>
> Pierre
>
> On Mon, Apr 9, 2018 at 12:48 PM, Ashutosh Mestry <ames...@hortonworks.com> wrote:
>
> > Hi
> >
> >
> >
> > Thanks for adding Cassandra support to Atlas. With this update, Atlas
> > fails on startup when used with the *HBase* backend.
> >
> >
> >
> > Attached is the patch that addresses the problem. I verified it in an
> > environment with the *HBase* backend, but I was not able to verify it with
> > Cassandra as the backend. Can you please review and let me know if the
> > change is OK? Other things look fine, I think.
> >
> >
> >
> > Best regards,
> >
> >
> >
> > *~ ashutosh*
> >
> > *Ashutosh Mestry* <ames...@hortonworks.com> . Staff Software Engineer .
> > Hortonworks, Inc. .  +1-310-988 0670
> >
> > .......
> >
> > *No hurry, no pause. – Tim Ferriss, Life Hacker, Author*
> >
> >
> >
>


