[GitHub] incubator-predictionio issue #345: [PIO-30] Set up a cross build for Spark 2...

2017-03-02 Thread shimamoto
Github user shimamoto commented on the issue:

https://github.com/apache/incubator-predictionio/pull/345
  
Hi @chanlee514 

I am interested in Spark 2.x support, but this PR bothers me a little. Do 
you mean that the real effect of this PR is that we can choose either Spark 1.6 
or Spark 2.x when running `make-distribution.sh`? That is, we cannot choose which 
Scala version to use: if we choose Spark 1.6, the Scala version is fixed at 2.10 
because Spark 1.x is built with Scala 2.10 by default; otherwise it is 2.11.

But #295 comment says that we can configure the following:
```
-Dbuild.profile=scala-2.11 -Dspark.version=1.6.0
```
If we configure this, PredictionIO should download the Spark source package 
and build it with Scala 2.11 support 
(http://spark.apache.org/docs/1.6.3/building-spark.html#building-for-scala-211).

Is this the same as what you had in mind? Could you tell me about the future 
policy for Spark and Scala support?
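
A minimal sketch of the version pairing described above, assuming the default is decided by the Spark major version alone. The helper names `default_scala_version` and `resolve_build` are hypothetical and are not part of PredictionIO or of `make-distribution.sh`.

```python
from typing import Optional


def default_scala_version(spark_version: str) -> str:
    """Scala line that this Spark version is built with by default."""
    major = int(spark_version.split(".")[0])
    # Per the comment above: Spark 1.x defaults to Scala 2.10, otherwise 2.11.
    return "2.10" if major < 2 else "2.11"


def resolve_build(spark_version: str, scala_version: Optional[str] = None) -> dict:
    """Hypothetical helper: pick a Scala version and flag non-default pairs."""
    default = default_scala_version(spark_version)
    chosen = scala_version or default
    return {
        "spark": spark_version,
        "scala": chosen,
        # A non-default pair means Spark itself has to be built from source.
        "needs_spark_source_build": chosen != default,
    }


print(resolve_build("1.6.0"))
print(resolve_build("1.6.0", "2.11"))
```

The second call corresponds to the `-Dbuild.profile=scala-2.11 -Dspark.version=1.6.0` case, which is the one that would require building Spark from its source package.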


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (PIO-30) Cross build for different versions of scala and spark

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PIO-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893682#comment-15893682
 ] 

ASF GitHub Bot commented on PIO-30:
---


> Cross build for different versions of scala and spark
> -
>
> Key: PIO-30
> URL: https://issues.apache.org/jira/browse/PIO-30
> Project: PredictionIO
> Issue Type: Improvement
> Reporter: Marcin Ziemiński
> Assignee: Chan
> Fix For: 0.11.0
>
>
> The present version of Scala is 2.10 and Spark is 1.4, which are quite old. 
> Spark 2.0.0 brings many performance improvements and features that people 
> will definitely want to use in their templates. I am also aware that the past 
> cannot be ignored, and simply dropping 1.x might not be an option for other 
> users. 
> I propose setting up a cross build in sbt to build with Scala 2.10 and Spark 
> 1.6, and a separate one for Scala 2.11 and Spark 2.0. Most of the files will 
> be consistent between versions, including the API. The problematic ones will be 
> divided between additional source directories: src/main/scala-2.10/ and 
> src/main/scala-2.11/. The dockerized tests should also take the two versions 
> into consideration.
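
The proposed layout can be sketched as a small build matrix. This is only an illustration in Python, not sbt configuration. The shared `src/main/scala/` directory is an assumption based on the standard sbt layout, and `BUILD_MATRIX` and `source_dirs` are hypothetical names.

```python
# Build matrix proposed in the issue: (Scala version, Spark version).
BUILD_MATRIX = [("2.10", "1.6"), ("2.11", "2.0")]


def source_dirs(scala_version: str) -> list:
    """Shared sources plus the directory specific to one Scala version."""
    return ["src/main/scala/", "src/main/scala-{}/".format(scala_version)]


for scala, spark in BUILD_MATRIX:
    print("Scala {} / Spark {}: {}".format(scala, spark, source_dirs(scala)))
```

Each build compiles the shared sources together with exactly one version-specific directory, so only the files that differ between the two versions need to be duplicated.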



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] incubator-predictionio issue #352: [PIO-49] Add support for Elasticsearch 5 ...

2017-03-02 Thread marevol
Github user marevol commented on the issue:

https://github.com/apache/incubator-predictionio/pull/352
  
In the ES5 integration tests, 3 tests fail on 
[move-storages-es5]...(https://github.com/marevol/incubator-predictionio/tree/move-storages-es5).
I'll check them today.

```
==
ERROR [0.555s]: runTest (pio_tests.scenarios.quickstart_test.QuickStartTest)
--
Traceback (most recent call last):
  File "//PredictionIO/tests/pio_tests/scenarios/quickstart_test.py", line 67, in setUp
    .format(self.training_data_path))
  File "/PredictionIO/tests/pio_tests/utils.py", line 33, in srun
    stderr=globals.std_err(), check=True)
  File "/usr/lib/python3.5/subprocess.py", line 708, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'curl https://raw.githubusercontent.com/apache/spark/master/data/mllib/sample_movielens_data.txt --create-dirs -o /PredictionIO/tests/pio_tests/data/quickstart_test/training_data.txt' returned non-zero exit status 6

==
ERROR [510.444s]: runTest (pio_tests.scenarios.eventserver_test.EventserverTest)
--
Traceback (most recent call last):
  File "//PredictionIO/tests/pio_tests/scenarios/eventserver_test.py", line 96, in runTest
    self.load_events("signup_events_51.json"))
  File "/PredictionIO/tests/pio_tests/utils.py", line 316, in import_events_batch
    return import_events_batch(events, self.test_context, self.id)
  File "/PredictionIO/tests/pio_tests/utils.py", line 157, in import_events_batch
    '--channel {}'.format(channel) if channel else ''))
  File "/PredictionIO/tests/pio_tests/utils.py", line 33, in srun
    stderr=globals.std_err(), check=True)
  File "/usr/lib/python3.5/subprocess.py", line 708, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'pio import --appid 1 --input /PredictionIO/tests/pio_tests/data/events.json.tmp ' returned non-zero exit status 1

==
ERROR [116.174s]: runTest (pio_tests.scenarios.basic_app_usecases.BasicAppUsecases)
--
Traceback (most recent call last):
  File "//PredictionIO/tests/pio_tests/scenarios/basic_app_usecases.py", line 80, in runTest
    self.check_data()
  File "//PredictionIO/tests/pio_tests/scenarios/basic_app_usecases.py", line 110, in check_data
    self.assertEqual(len(buy_events) + len(rate_events), len(r.json()))
AssertionError: 40 != 39

--
Ran 3 tests in 627.175s
```
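
The first two errors are raised by `subprocess.run(..., check=True)`, which turns any non-zero exit status into a `CalledProcessError`. The sketch below shows that mechanism with a stand-in child process instead of the real `curl` call. `run_checked` and `CURL_EXIT_HINTS` are hypothetical names, and the hint texts paraphrase the curl manual, where exit status 6 means the host could not be resolved.

```python
import subprocess
import sys

# Paraphrased from the curl manual's exit-code list (only two codes shown).
CURL_EXIT_HINTS = {6: "could not resolve host", 7: "failed to connect to host"}


def run_checked(cmd):
    """Run cmd and return (ok, exit_status) instead of raising."""
    try:
        subprocess.run(cmd, check=True)
        return True, 0
    except subprocess.CalledProcessError as err:
        # check=True raises as soon as the child exits with a non-zero status.
        return False, err.returncode


# Stand-in for the failing curl call: a child process that exits with status 6.
ok, status = run_checked([sys.executable, "-c", "import sys; sys.exit(6)"])
print(ok, status, CURL_EXIT_HINTS.get(status, "unknown"))
```

If the quickstart failure is indeed a name-resolution problem inside the test container, it would be unrelated to the Elasticsearch change itself.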

> use ES5 as default

This PR puts the ES1 jar file in lib/spark/ and the ES5 jar in extra/ in 
PredictionIO-*.tar.gz.
By "use ES5 as default" I meant replacing it with the ES5 jar in lib/spark/.





[jira] [Commented] (PIO-49) Add support for Elasticsearch 5.x

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PIO-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893078#comment-15893078
 ] 

ASF GitHub Bot commented on PIO-49:
---

> Add support for Elasticsearch 5.x
> -
>
> Key: PIO-49
> URL: https://issues.apache.org/jira/browse/PIO-49
> Project: PredictionIO
> Issue Type: Improvement
> Reporter: Shinsuke Sugaya
>
> We are working on meta/event storage support for Elasticsearch 5.x.
> Although Elasticsearch 2.x does not allow dots in field names,
> Elasticsearch 5.x supports them. So, it's better to upgrade to an ES 5.x release.
> Since ES 5.x provides a Java REST client, we replaced
> Transport communication with HTTP. Therefore, our fix
> uses HTTP (port 9200) only.
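
A minimal sketch of what HTTP-only access looks like from a client, assuming the Elasticsearch 5.x document endpoint `PUT /{index}/{type}/{id}`. The request is only built, not sent, so no running cluster is needed. The index, type and field names are hypothetical, and `build_index_request` is not part of PredictionIO.

```python
import json
from urllib.request import Request


def build_index_request(host, index, doc_type, doc_id, document, port=9200):
    """Build (but do not send) an HTTP request that indexes one document."""
    url = "http://{}:{}/{}/{}/{}".format(host, port, index, doc_type, doc_id)
    body = json.dumps(document).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


# A field name containing a dot, which the issue says ES 5.x accepts.
req = build_index_request("localhost", "pio_event", "events", "1",
                          {"properties.rating": 4.5})
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Only port 9200 appears here; the Transport protocol and its separate port are not involved.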





[jira] [Commented] (PIO-49) Add support for Elasticsearch 5.x

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PIO-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893018#comment-15893018
 ] 

ASF GitHub Bot commented on PIO-49:
---

Github user dszeto commented on the issue:

https://github.com/apache/incubator-predictionio/pull/352
  
@marevol Super cool. How are the integration tests going? And what do you 
mean by "use ES5 as default"?




[GitHub] incubator-predictionio issue #355: [PIO-56] Adding embedded elasticsearch an...

2017-03-02 Thread marevol
Github user marevol commented on the issue:

https://github.com/apache/incubator-predictionio/pull/355
  
Please see https://github.com/apache/incubator-predictionio/pull/352
We are working on Elasticsearch 5 support there.
That fix removes Elasticsearch from core.




[GitHub] incubator-predictionio pull request #355: [PIO-56] Adding embedded elasticse...

2017-03-02 Thread lucasbm88
GitHub user lucasbm88 opened a pull request:

https://github.com/apache/incubator-predictionio/pull/355

[PIO-56] Adding embedded elasticsearch and mocked configuration for tests

This pull request adds code that removes the need for an 
Elasticsearch installation and pio-env configuration when running the unit tests of 
the project core.

Basically the changes are:
 - Adding scalamock as a dependency for project core
 - Modifying data Storage.scala file to allow mocked configuration
 - Creating a new helper object to start and shutdown embedded elasticsearch
 - Modifying existing tests to use new infrastructure.
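
The idea of allowing a mocked configuration can be sketched as follows. This is a Python analogue, not the Scala change made in this pull request. `StorageConfig` is a hypothetical class, and the environment variable name is assumed to match the one used in pio-env.

```python
import os
from typing import Mapping, Optional


class StorageConfig:
    """Hypothetical stand-in for a storage object that reads its settings."""

    def __init__(self, env: Optional[Mapping[str, str]] = None):
        # Default to the real environment; tests pass a plain dict instead.
        self._env = os.environ if env is None else env

    def metadata_source(self) -> str:
        return self._env["PIO_STORAGE_REPOSITORIES_METADATA_SOURCE"]


# In a unit test, no pio-env configuration is needed:
fake_env = {"PIO_STORAGE_REPOSITORIES_METADATA_SOURCE": "ELASTICSEARCH"}
config = StorageConfig(fake_env)
print(config.metadata_source())
```

Injecting the configuration this way keeps production behaviour unchanged while letting tests supply their own values.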

More details of the issue in ASF Jira: 
https://issues.apache.org/jira/browse/PIO-56

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lucasbm88/incubator-predictionio develop

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-predictionio/pull/355.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #355


commit 077b18a5bd83f81d0e5152e197f1590fa69ba6f6
Author: administrador 
Date:   2017-03-02T02:48:28Z

Adjusting the project core tests and Storage object in order to use an 
embedded elasticsearch and mocked METADATA configuration on unit tests. Fix for 
#PIO-56



