[GitHub] incubator-predictionio issue #345: [PIO-30] Set up a cross build for Spark 2...
Github user shimamoto commented on the issue:

https://github.com/apache/incubator-predictionio/pull/345

Hi @chanlee514

I am interested in Spark 2.x support, but this PR bothers me a little. Do you mean that the real effect of this PR is that we can choose either Spark 1.6 or Spark 2.x when running `make-distribution.sh`? As I understand it, we cannot choose which Scala version to use: if we choose Spark 1.6, the Scala version is fixed at 2.10, because Spark 1.x is built with Scala 2.10 by default; otherwise it is 2.11.

However, a comment on #295 says that we can configure the following:

```
-Dbuild.profile=scala-2.11 -Dspark.version=1.6.0
```

If we configure this, PredictionIO would have to download the Spark source package and build it with Scala 2.11 support (http://spark.apache.org/docs/1.6.3/building-spark.html#building-for-scala-211). Is this the same as what you had in mind? Could you tell me about the future policy for Spark and Scala support?

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
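The question in the comment above comes down to a small decision rule, which can be sketched as follows. This is only an illustration of the reasoning, not PredictionIO's build logic: the function name `resolve_build` and the lookup table are hypothetical, and the defaults (Scala 2.10 for Spark 1.x, Scala 2.11 for Spark 2.x) are taken from the comment itself.

```python
# Hypothetical helper for illustration; not part of PredictionIO's scripts.
# Default Scala binary version per Spark major version, as stated in the
# comment (Spark 1.x -> 2.10, otherwise 2.11).
DEFAULT_SCALA_FOR_SPARK_MAJOR = {1: "2.10", 2: "2.11"}


def resolve_build(spark_version, scala_version=None):
    """Return (scala_version, needs_spark_source_build).

    needs_spark_source_build is True when the requested Scala version
    differs from the default for that Spark line, which is the case the
    comment says would require building Spark from source.
    """
    major = int(spark_version.split(".")[0])
    default_scala = DEFAULT_SCALA_FOR_SPARK_MAJOR[major]
    chosen = scala_version or default_scala
    return chosen, chosen != default_scala


if __name__ == "__main__":
    # The combination quoted from #295: Scala 2.11 with Spark 1.6.0.
    print(resolve_build("1.6.0", "2.11"))
```

Under these assumptions, only the combination quoted from #295 (Scala 2.11 with Spark 1.6.0) is flagged as needing a Spark source build; the two default pairings are not.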
[jira] [Commented] (PIO-30) Cross build for different versions of scala and spark
[ https://issues.apache.org/jira/browse/PIO-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893682#comment-15893682 ]

ASF GitHub Bot commented on PIO-30:
---

Github user shimamoto commented on the issue:

https://github.com/apache/incubator-predictionio/pull/345

> Cross build for different versions of scala and spark
> -----------------------------------------------------
>
> Key: PIO-30
> URL: https://issues.apache.org/jira/browse/PIO-30
> Project: PredictionIO
> Issue Type: Improvement
> Reporter: Marcin Ziemiński
> Assignee: Chan
> Fix For: 0.11.0
>
>
> The present version of Scala is 2.10 and Spark is 1.4, which is quite old.
> With Spark 2.0.0 come many performance improvements and features that people
> will definitely want to add to their templates. I am also aware that the past
> cannot be ignored and simply dumping 1.x might not be an option for other
> users.
> I propose setting up a cross build in sbt to build with Scala 2.10 and Spark
> 1.6 and a separate one for Scala 2.11 and Spark 2.0. Most of the files will
> be consistent between versions, including the API. The problematic ones will be
> divided between additional source directories: src/main/scala-2.10/ and
> src/main/scala-2.11/. The dockerized tests should also take the two versions
> into consideration.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
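The source layout proposed in the issue can be illustrated with a short sketch. It assumes a shared `src/main/scala` directory alongside the version-specific ones named in the issue; the shared directory and the function name `scala_source_dirs` are assumptions made for illustration, and the real selection would happen in the sbt build rather than in Python.

```python
from pathlib import PurePosixPath


def scala_source_dirs(scala_version, base="src/main"):
    """List the source directories compiled for one Scala version.

    Assumption: shared code lives in <base>/scala and version-specific
    code in <base>/scala-<binary version>, e.g. src/main/scala-2.10.
    """
    # "2.10.6" -> "2.10": only the binary version selects the directory.
    binary = ".".join(scala_version.split(".")[:2])
    root = PurePosixPath(base)
    return [str(root / "scala"), str(root / "scala-{}".format(binary))]


if __name__ == "__main__":
    for version in ("2.10.6", "2.11.8"):
        print(version, scala_source_dirs(version))
```

Keeping the version-specific files in their own directories means the shared directory never has to branch on the Scala version.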
[GitHub] incubator-predictionio issue #352: [PIO-49] Add support for Elasticsearch 5 ...
Github user marevol commented on the issue:

https://github.com/apache/incubator-predictionio/pull/352

In the ES5 integration tests, 3 tests fail on [move-storages-es5]...(https://github.com/marevol/incubator-predictionio/tree/move-storages-es5)
I'll check them today.

```
==
ERROR [0.555s]: runTest (pio_tests.scenarios.quickstart_test.QuickStartTest)
--
Traceback (most recent call last):
  File "//PredictionIO/tests/pio_tests/scenarios/quickstart_test.py", line 67, in setUp
    .format(self.training_data_path))
  File "/PredictionIO/tests/pio_tests/utils.py", line 33, in srun
    stderr=globals.std_err(), check=True)
  File "/usr/lib/python3.5/subprocess.py", line 708, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'curl https://raw.githubusercontent.com/apache/spark/master/data/mllib/sample_movielens_data.txt --create-dirs -o /PredictionIO/tests/pio_tests/data/quickstart_test/training_data.txt' returned non-zero exit status 6

==
ERROR [510.444s]: runTest (pio_tests.scenarios.eventserver_test.EventserverTest)
--
Traceback (most recent call last):
  File "//PredictionIO/tests/pio_tests/scenarios/eventserver_test.py", line 96, in runTest
    self.load_events("signup_events_51.json"))
  File "/PredictionIO/tests/pio_tests/utils.py", line 316, in import_events_batch
    return import_events_batch(events, self.test_context, self.id)
  File "/PredictionIO/tests/pio_tests/utils.py", line 157, in import_events_batch
    '--channel {}'.format(channel) if channel else ''))
  File "/PredictionIO/tests/pio_tests/utils.py", line 33, in srun
    stderr=globals.std_err(), check=True)
  File "/usr/lib/python3.5/subprocess.py", line 708, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'pio import --appid 1 --input /PredictionIO/tests/pio_tests/data/events.json.tmp ' returned non-zero exit status 1

==
ERROR [116.174s]: runTest (pio_tests.scenarios.basic_app_usecases.BasicAppUsecases)
--
Traceback (most recent call last):
  File "//PredictionIO/tests/pio_tests/scenarios/basic_app_usecases.py", line 80, in runTest
    self.check_data()
  File "//PredictionIO/tests/pio_tests/scenarios/basic_app_usecases.py", line 110, in check_data
    self.assertEqual(len(buy_events) + len(rate_events), len(r.json()))
AssertionError: 40 != 39

--
Ran 3 tests in 627.175s
```

> use ES5 as default

This PR puts the ES1 jar file in lib/spark/ and the ES5 jar in extra/ in PredictionIO-*.tar.gz. "Use ES5 as default" means that the jar in lib/spark/ is replaced with the ES5 jar.
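The first failure above ends with curl exit status 6, which is curl's code for "could not resolve host", so it looks like a transient network problem in the test environment rather than an ES5 regression. One possible way to make such a setup step more tolerant is to retry only on that exit code, sketched below. This is not how `pio_tests.utils.srun` is implemented; `run_with_retry` and its parameters are hypothetical names used for illustration.

```python
import subprocess
import time

# curl's documented exit code for "could not resolve host".
CURL_COULD_NOT_RESOLVE_HOST = 6


def run_with_retry(command, attempts=3, delay_seconds=1.0, run=subprocess.run):
    """Run a shell command, retrying only when it exits with status 6.

    Any other non-zero exit status is re-raised immediately, so real
    failures (such as the `pio import` exit status 1) are not hidden.
    The `run` parameter exists so tests can substitute a fake runner.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return run(command, shell=True, check=True)
        except subprocess.CalledProcessError as error:
            if error.returncode != CURL_COULD_NOT_RESOLVE_HOST:
                raise
            last_error = error
            # Wait before the next try, but not after the final one.
            if attempt + 1 < attempts:
                time.sleep(delay_seconds)
    raise last_error
```

The other two failures (the `pio import` exit status 1 and the `40 != 39` assertion) would pass straight through this wrapper unchanged, which is intentional: they need investigation, not retries.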
[jira] [Commented] (PIO-49) Add support for Elasticsearch 5.x
[ https://issues.apache.org/jira/browse/PIO-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893078#comment-15893078 ]

ASF GitHub Bot commented on PIO-49:
---

Github user marevol commented on the issue:

https://github.com/apache/incubator-predictionio/pull/352

> Add support for Elasticsearch 5.x
> ---------------------------------
>
> Key: PIO-49
> URL: https://issues.apache.org/jira/browse/PIO-49
> Project: PredictionIO
> Issue Type: Improvement
> Reporter: Shinsuke Sugaya
>
> We are working on meta/event storage support for Elasticsearch 5.x.
> Elasticsearch 2.x does not allow dots in field names, but
> Elasticsearch 5.x does, so it is better to upgrade to an ES 5.x release.
> Since ES 5.x provides a Java REST API client, we replaced
> Transport communication with HTTP. Our fix therefore
> uses HTTP (port 9200) only.
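The two points in the issue description (dotted field names and HTTP on port 9200 instead of the transport protocol) can be illustrated with a small sketch that only builds the request and does not send it. The pull request itself uses the Elasticsearch Java REST client; the Python below, the function name `build_index_request`, and the index, type, and field names in the usage example are all hypothetical.

```python
import json
from urllib.request import Request


def build_index_request(host, index, doc_type, doc_id, document, port=9200):
    """Build (but do not send) an HTTP PUT that indexes one document.

    Uses the Elasticsearch 5.x REST path /{index}/{type}/{id} on the
    HTTP port, rather than the transport protocol.
    """
    url = "http://{}:{}/{}/{}/{}".format(host, port, index, doc_type, doc_id)
    body = json.dumps(document).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


if __name__ == "__main__":
    # Hypothetical index, type, and field names; the dotted field name
    # is the kind of key the issue says ES 2.x rejects and ES 5.x accepts.
    request = build_index_request(
        "localhost", "pio_event", "events", "1", {"properties.rating": 4})
    print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it to a running cluster; that step is left out so the sketch has no external dependency.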
[jira] [Commented] (PIO-49) Add support for Elasticsearch 5.x
[ https://issues.apache.org/jira/browse/PIO-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893018#comment-15893018 ]

ASF GitHub Bot commented on PIO-49:
---

Github user dszeto commented on the issue:

https://github.com/apache/incubator-predictionio/pull/352

@marevol Super cool. How are the integration tests? And what do you mean by "use ES5 as default"?
[GitHub] incubator-predictionio issue #355: [PIO-56] Adding embedded elasticsearch an...
Github user marevol commented on the issue:

https://github.com/apache/incubator-predictionio/pull/355

Please see https://github.com/apache/incubator-predictionio/pull/352
We are working on Elasticsearch 5 support. This fix removes elasticsearch from core.
[GitHub] incubator-predictionio pull request #355: [PIO-56] Adding embedded elasticse...
GitHub user lucasbm88 opened a pull request:

https://github.com/apache/incubator-predictionio/pull/355

[PIO-56] Adding embedded elasticsearch and mocked configuration for tests

This pull request adds code that removes the need for an Elasticsearch installation and pio-env configuration when running the unit tests of the project core.

The changes are:
- Adding scalamock as a dependency for project core
- Modifying the data Storage.scala file to allow mocked configuration
- Creating a new helper object to start and shut down embedded Elasticsearch
- Modifying existing tests to use the new infrastructure

More details of the issue are in ASF Jira: https://issues.apache.org/jira/browse/PIO-56

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lucasbm88/incubator-predictionio develop

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-predictionio/pull/355.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #355

commit 077b18a5bd83f81d0e5152e197f1590fa69ba6f6
Author: administrador
Date: 2017-03-02T02:48:28Z

    Adjusting the project core tests and Storage object in order to use an embedded elasticsearch and mocked METADATA configuration on unit tests. Fix for #PIO-56
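The "mocked configuration" idea in this pull request can be sketched independently of Scala and scalamock: if the storage code reads its settings from a mapping passed in by the caller instead of directly from the process environment, a unit test can supply a plain dictionary and needs no pio-env file. The Python below is only an analogy to the Storage.scala change, not its implementation; `load_metadata_config` is a hypothetical name, and the `PIO_STORAGE_*` keys are assumed to follow the pio-env naming scheme.

```python
def load_metadata_config(env):
    """Read the METADATA repository settings from a mapping.

    Production code would pass os.environ; tests pass a plain dict,
    which plays the role of the mocked configuration.
    """
    source = env["PIO_STORAGE_REPOSITORIES_METADATA_SOURCE"]
    prefix = "PIO_STORAGE_SOURCES_{}_".format(source)
    return {
        "name": env["PIO_STORAGE_REPOSITORIES_METADATA_NAME"],
        "source": source,
        "type": env[prefix + "TYPE"],
        # Default to localhost when no hosts are configured, as an
        # embedded test instance typically would be local.
        "hosts": env.get(prefix + "HOSTS", "localhost").split(","),
    }


if __name__ == "__main__":
    fake_env = {
        "PIO_STORAGE_REPOSITORIES_METADATA_NAME": "test_metadata",
        "PIO_STORAGE_REPOSITORIES_METADATA_SOURCE": "ELASTICSEARCH",
        "PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE": "elasticsearch",
    }
    print(load_metadata_config(fake_env))
```

In production the same function would be called as `load_metadata_config(os.environ)`, so the test path and the real path share all of the parsing logic.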