[GitHub] [gora] djkevincr commented on issue #161: GORA-565: Enable Spark in Unit Tests
djkevincr commented on issue #161: GORA-565: Enable Spark in Unit Tests
URL: https://github.com/apache/gora/pull/161#issuecomment-488335336

I will merge this PR as it is and send a new PR to fix the failing tests.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
[GitHub] [gora] djkevincr commented on issue #161: GORA-565: Enable Spark in Unit Tests
djkevincr commented on issue #161: GORA-565: Enable Spark in Unit Tests
URL: https://github.com/apache/gora/pull/161#issuecomment-487840780

```
public Configuration generateOutputConf(DataStore dataStore) throws IOException {
  Configuration hadoopConf = ((Configurable) dataStore).getConf();
  // Configuration hadoopConf = new Configuration(); - previously it was
  GoraMapReduceUtils.setIOSerializations(hadoopConf, true);
  Job job = Job.getInstance(hadoopConf);
  return generateOutputConf(job, dataStore.getClass(), dataStore.getKeyClass(), dataStore.getPersistentClass());
}
```

To fix the issue properly, I think the generateOutputConf method of the GoraSparkEngine class should be changed to the above (see the single-line Java comment, which shows the previous code). Basically, we should reuse the Hadoop conf created in GoraMongodbTestDriver throughout the entire test run. During the initial test startup phase, we set the Mongo server details on that Hadoop conf.

I am not sure whether there is a case in which the dataStore conf is returned as null (e.g. deserialization), so we can do something similar to the code below, which is extracted from the initialize method of the GoraSparkEngine class. Can you please address the changes and update the PR?

```
if ((dataStore instanceof Configurable) && ((Configurable) dataStore).getConf() != null) {
  hadoopConf = ((Configurable) dataStore).getConf();
} else {
  hadoopConf = new Configuration();
}
```
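To make the null-safe fallback concrete, here is a minimal, stdlib-only sketch of the same decision. It is an illustration rather than Gora code: `Conf` and `HasConf` are hypothetical stand-ins for Hadoop's `Configuration` and `Configurable`, and `resolveConf` is a name made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class ConfFallbackSketch {

    /** Hypothetical stand-in for org.apache.hadoop.conf.Configuration. */
    static class Conf {
        final Map<String, String> values = new HashMap<>();
    }

    /** Hypothetical stand-in for org.apache.hadoop.conf.Configurable. */
    interface HasConf {
        Conf getConf();
    }

    /**
     * Reuses the conf carried by the data store when there is one,
     * and only falls back to a fresh conf when the store has none.
     */
    static Conf resolveConf(Object dataStore) {
        if (dataStore instanceof HasConf && ((HasConf) dataStore).getConf() != null) {
            return ((HasConf) dataStore).getConf();
        }
        return new Conf();
    }

    public static void main(String[] args) {
        Conf shared = new Conf();
        shared.values.put("example.mongo.servers", "127.0.0.1:34567"); // placeholder key

        HasConf configuredStore = () -> shared;
        HasConf storeWithoutConf = () -> null;

        // The store's own conf is reused, so the test server address survives.
        System.out.println(resolveConf(configuredStore) == shared);
        // A store with a null conf gets a fresh, empty conf instead of a NullPointerException.
        System.out.println(resolveConf(storeWithoutConf).values.isEmpty());
    }
}
```

The point of the check is that the conf populated at test startup is passed along whenever it exists, while a missing conf still yields a usable object.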
[GitHub] [gora] djkevincr commented on issue #161: GORA-565: Enable Spark in Unit Tests
djkevincr commented on issue #161: GORA-565: Enable Spark in Unit Tests
URL: https://github.com/apache/gora/pull/161#issuecomment-487832799

@sneceesay77 I have tested your PR locally and ran into some issues.

```
# Don't override properties coming from Hadoop configuration for test
# Those properties will contains override for Mongo server port
gora.mongodb.override_hadoop_configuration=true
```

I noticed you have made the property change above, and because of it the Apache Flink word count test is failing in the mongo-db module. As I can see from the log traces, the Flink jobs fail to connect to the Mongo server running on the correct port. Basically, the port from the property file should not override the Mongo server port that GoraMongodbTestDriver sets on the Hadoop conf ( conf.set(MongoStoreParameters.PROP_MONGO_SERVERS, "127.0.0.1:" + port); ). The value in the Hadoop conf should have priority. Can you please check on this?
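The precedence described above can be sketched with plain Java collections. This is a simplified model under stated assumptions, not the actual MongoStore parameter loading: the Hadoop conf is modelled as a `Map`, `resolveServers` is a hypothetical helper, and `example.mongo.servers` is a placeholder key. Only the override flag name is taken from the property file quoted above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class MongoServerPrecedence {

    // Flag name as quoted from the test property file.
    static final String OVERRIDE_KEY = "gora.mongodb.override_hadoop_configuration";
    // Placeholder key for this sketch, not necessarily the real Gora constant.
    static final String SERVERS_KEY = "example.mongo.servers";

    /**
     * Picks the Mongo server address. When the override flag is true the
     * property file wins; otherwise the Hadoop conf value has priority and
     * the property file only supplies a default.
     */
    static String resolveServers(Map<String, String> hadoopConf, Properties fileProps) {
        boolean override = Boolean.parseBoolean(fileProps.getProperty(OVERRIDE_KEY, "false"));
        String fromFile = fileProps.getProperty(SERVERS_KEY);
        if (override) {
            return fromFile;
        }
        return hadoopConf.getOrDefault(SERVERS_KEY, fromFile);
    }

    public static void main(String[] args) {
        // Address the test driver would set for the embedded Mongo server (port is illustrative).
        Map<String, String> hadoopConf = new HashMap<>();
        hadoopConf.put(SERVERS_KEY, "127.0.0.1:34567");

        Properties fileProps = new Properties();
        fileProps.setProperty(SERVERS_KEY, "localhost:27017");

        fileProps.setProperty(OVERRIDE_KEY, "true");
        System.out.println(resolveServers(hadoopConf, fileProps)); // file value: wrong port for the test

        fileProps.setProperty(OVERRIDE_KEY, "false");
        System.out.println(resolveServers(hadoopConf, fileProps)); // Hadoop conf value: test server port
    }
}
```

With the flag set to true, the job is pointed at the port from the property file rather than the one the test driver chose, which is consistent with the connection failures seen in the Flink test.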
[GitHub] [gora] djkevincr commented on issue #161: GORA-565: Enable Spark in Unit Tests
djkevincr commented on issue #161: GORA-565: Enable Spark in Unit Tests
URL: https://github.com/apache/gora/pull/161#issuecomment-484770502

@sneceesay77 This looks great, thank you for the contribution. I'll keep this PR open for a while so that others can have a look.