On Mar 13, 2017, at 9:05 PM, Lin Amy <[email protected]> wrote:
>
> Hello Pat,
>
> I am using two URs because one of them simply uses popularity for its
> recommendations while the other is the normal version. Both engines use the
> same data (app), about 10G.
The same model can serve both “recommendations” and “popular”. The simplest form of this is to train with all the data and then query with no user or item; that query will return popular results. The same model and engine can then be queried with a user and/or item for personalized or item-based recs. In the soon-to-be-released 0.6.0 you can also get item-set recommendations from the same model. There is no need at all for 2 URs to solve this case, and 2 URs will cause 2 separate processes to start up, which is moderately heavyweight compared to one process using the method above.

> I have two machines running pio; one only runs Elasticsearch as a slave,
> and the other runs everything (HBase, ES, Spark), so I am running
> `pio-start-all` on the second machine. Both machines have 8 cores and 16G
> of memory.

The problem with this is that Spark has 2 parts, the driver and the worker. Both need a lot of memory, and they are running on a single machine whose 16G must be shared among the Spark driver, the Spark worker, HBase, and HDFS. For a 10G dataset, Spark could need most of your available memory, split between driver and worker in roughly equal portions. For an overview of how Spark works see: http://actionml.com/docs/intro_to_spark

Since Spark is really only needed during training, you may want another machine for the Spark worker if you continue to have trouble running all of this on one machine.

> You can find the two engine.json files attached.
> Thank you for helping!
>
> Best regards,
> Amy
>
> Pat Ferrel <[email protected]> wrote on Tue, Mar 14, 2017 at 1:30 AM:
>
> If you are running pio-start-all you must be running everything on a single
> machine. This is called vertical scaling and is very prone to running out
> of resources, either compute cores or memory. If it has been running for
> some time, you may have finally hit the limit of what you can do on the
> machine.
>
> What are the machine's specs: cores, memory?
> What is the size of your data?
> Have you exported it with `pio export`?
>
> Also, do you have a different indexName in the 2 engine.json files? And why
> have 2 URs?
>
>
> On Mar 12, 2017, at 8:58 PM, Lin Amy <[email protected]> wrote:
>
> Hello everyone,
>
> I have two Universal Recommender engines using the same events. This
> morning I found the server busy running at 100% CPU, so I shut it down and
> tried to bring all the servers back up.
> However, after `pio-start-all` succeeded, I ran `pio train` on the two
> engines; one succeeded and the other failed. It returned the following
> error message:
>
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due
> to stage failure: Task 1 in stage 47.0 failed 1 times, most recent failure:
> Lost task 1.0 in stage 47.0 (TID 156, localhost):
> org.apache.spark.util.TaskCompletionListenerException: Found unrecoverable
> error [10.1.3.100:9200] returned Bad Request(400) -
> [MapperParsingException[failed to parse [t]]; nested:
> ElasticsearchIllegalArgumentException[unknown property [obj]]; ]; Bailing
> out..
> at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:112)
> at org.apache.spark.scheduler.Task.run(Task.scala:102)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>
> Any advice on how to solve this weird situation?
> Thank you!
>
> Best regards,
> Amy
>
> <normal_version_engine.json><popularity_engine.json>
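Pat's one-model approach comes down to what the query body contains: an empty query returns popularity-ranked results, while adding a user and/or item returns personalized or item-based recs from the same model. A minimal sketch of building those query bodies (the `build_query` helper is illustrative, not part of the UR; `user`, `item`, and `num` are standard UR query fields):

```python
import json

# Illustrative helper: build a Universal Recommender query body.
# The resulting dict would be sent as JSON to the engine's query endpoint.
def build_query(user=None, item=None, num=10):
    query = {"num": num}
    if user is not None:
        query["user"] = user    # personalized recs for this user
    if item is not None:
        query["item"] = item    # items similar to this item
    return query

# No user or item: the engine falls back to popularity ranking.
popular = build_query()
# With a user: personalized recommendations from the same model.
personal = build_query(user="u-123")

print(json.dumps(popular))   # {"num": 10}
print(json.dumps(personal))
```

One engine serving both query shapes avoids the second heavyweight process entirely.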

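The MapperParsingException above is the kind of error two engines can cause by writing conflicting field mappings into the same Elasticsearch index, which is why Pat asks whether the two engine.json files use different indexName values. A hedged sketch of that check (the config fragments below are illustrative, not Amy's attached files; in the UR, indexName sits in the algorithm's params):

```python
# Illustrative engine.json fragments (real files contain many more fields);
# these are NOT the attached files, just the shape of the relevant part.
normal_cfg = {"algorithms": [{"params": {"indexName": "ur_normal"}}]}
popular_cfg = {"algorithms": [{"params": {"indexName": "ur_popular"}}]}

def index_name(cfg):
    """Pull the Elasticsearch index name a UR engine writes to."""
    return cfg["algorithms"][0]["params"]["indexName"]

# Two engines sharing one index can write conflicting field mappings,
# producing errors like the MapperParsingException seen at train time.
assert index_name(normal_cfg) != index_name(popular_cfg)
```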