RE: drools on spark, how to reload rule file?

2016-04-18 Thread yaoxiaohua
Thanks for your reply, Jason.

I can use a stateless session in the Spark Streaming job.

But my question now is: when a rule is updated, how do I pass it to the RDD?

We create a ruleExecutor (a stateless session) in the main method,

then pass the ruleExecutor into the RDD.

I am new to Drools, and I am reading the Drools documentation now.

Best Regards,

Evan

From: Jason Nerothin [mailto:jasonnerot...@gmail.com] 
Sent: April 18, 2016 21:42
To: yaoxiaohua
Cc: user@spark.apache.org
Subject: Re: drools on spark, how to reload rule file?

 

The limitation is in the Drools implementation.

 

Changing a rule in a stateful KB is not possible, particularly if it leads to 
logical contradictions with the previous version or any other rule in the KB.

 

When we ran into this, we worked around (part of) it by salting the rule name 
with a unique id. To get the existing rules to be evaluated when we wanted, we 
kept a property on each fact that we mutated each time. 
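
The rule-name salting could look roughly like the following. This is a sketch only, assuming the rules are available as DRL text with double-quoted rule names; `RuleNameSalter` is a hypothetical helper for illustration, not part of Drools:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: appends a unique id to every rule name in a DRL text,
// so that a reloaded rule never collides with the previous version's name.
class RuleNameSalter {
    // Matches a rule declaration at the start of a line: rule "Some name"
    // (assumption: rule names are written in double quotes)
    private static final Pattern RULE_DECL =
            Pattern.compile("(?m)^(\\s*rule\\s+\")([^\"]+)(\")");

    static String salt(String drl, String id) {
        Matcher m = RULE_DECL.matcher(drl);
        // Keep the original name and append "_<id>" inside the quotes.
        return m.replaceAll("$1$2_" + Matcher.quoteReplacement(id) + "$3");
    }

    public static void main(String[] args) {
        String drl = "rule \"High CPU\"\nwhen\nthen\nend\n";
        System.out.println(salt(drl, java.util.UUID.randomUUID().toString()));
    }
}
```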

 

Hackery, but it worked.

 

I recommend you try hard to use a stateless KB, if it is possible.

Thank you.

 

Jason

 

// brevity and poor typing by iPhone


On Apr 18, 2016, at 04:43, yaoxiaohua <yaoxiao...@outlook.com> wrote:

Hi bros,

I am trying to use Drools on Spark to parse logs, match
rules, and derive some fields.

I am following this blog post from Cloudera:

http://blog.cloudera.com/blog/2015/11/how-to-build-a-complex-event-processing-app-on-apache-spark-and-drools/



Now I want to know whether it is possible to reload the rules on
the fly.

Thanks in advance.

 

Best Regards,

Evan



drools on spark, how to reload rule file?

2016-04-18 Thread yaoxiaohua
Hi bros,

I am trying to use Drools on Spark to parse logs, match
rules, and derive some fields.

I am following this blog post from Cloudera:

http://blog.cloudera.com/blog/2015/11/how-to-build-a-complex-event-processing-app-on-apache-spark-and-drools/



Now I want to know whether it is possible to reload the rules on
the fly.

Thanks in advance.

 

Best Regards,

Evan



how to use custom properties in spark app

2016-04-05 Thread yaoxiaohua
Hi bro,

I am new to Spark application development, and I need to develop two
apps that run on a Spark cluster.

The applications take some arguments.

I can pass them as program arguments to spark-submit,
but I would like to find another way.

The arguments are things such as the JDBC URL, the Elasticsearch
nodes, and the Kafka group id,

so I want to know whether there is a best practice for
this.

Can I read them from a custom properties file?
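
Reading such settings from a properties file can be done with `java.util.Properties`. A minimal sketch; the class `AppConfig` and the key names (`jdbc.url`, `es.nodes`, `kafka.group.id`) are hypothetical and chosen only to mirror the arguments listed above:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper that loads application settings from a .properties file.
class AppConfig {
    // Parses properties from text; handy for tests.
    static Properties parse(String text) {
        return read(new StringReader(text));
    }

    // Loads properties from a file path (UTF-8).
    static Properties loadFile(String path) {
        try (Reader reader = Files.newBufferedReader(Paths.get(path), StandardCharsets.UTF_8)) {
            return read(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static Properties read(Reader reader) {
        Properties props = new Properties();
        try {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The file path is the only program argument.
        Properties props = loadFile(args[0]);
        System.out.println("JDBC URL:       " + props.getProperty("jdbc.url"));
        // The second argument to getProperty is a default for missing keys.
        System.out.println("ES nodes:       " + props.getProperty("es.nodes", "localhost:9200"));
        System.out.println("Kafka group id: " + props.getProperty("kafka.group.id"));
    }
}
```

With this approach the job takes a single program argument (the file path) instead of one argument per setting. One option for getting the file onto the cluster is spark-submit's `--files` flag.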



Another, similar question: I need to read some rules in order to
analyze data.

Currently I store the rules in MySQL and read them
from MySQL before I use them.

Is there a better way to do this?

Thanks in advance.

 

Best Regards,

Evan Yao



what is the best practice to read configure file in spark streaming

2016-03-15 Thread yaoxiaohua
Hi guys,

I'm using Kafka + Spark Streaming to do log analysis.

My requirement is that the log alarm rules may change
from time to time.

A rule may look like this:

App=Hadoop,keywords=oom|Exception|error,threshold=10

The threshold or the keywords may be updated from time to time.
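
A rule line in this format can be parsed into a small value object. A minimal sketch, assuming comma-separated key=value fields and `|`-separated keywords as in the example; the class name `LogRule` and the case-sensitive substring matching are assumptions for illustration:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical value object for one alarm rule line.
class LogRule {
    final String app;
    final List<String> keywords;
    final int threshold;

    LogRule(String app, List<String> keywords, int threshold) {
        this.app = app;
        this.keywords = keywords;
        this.threshold = threshold;
    }

    // Parses e.g. "App=Hadoop,keywords=oom|Exception|error,threshold=10".
    static LogRule parse(String line) {
        Map<String, String> fields = new HashMap<>();
        for (String part : line.trim().split(",")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + part);
            }
            // Field names are compared case-insensitively ("App" vs "app").
            String key = part.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            fields.put(key, part.substring(eq + 1).trim());
        }
        String app = fields.get("app");
        String keywords = fields.get("keywords");
        String threshold = fields.get("threshold");
        if (app == null || keywords == null || threshold == null) {
            throw new IllegalArgumentException("Missing field in rule: " + line);
        }
        return new LogRule(app, Arrays.asList(keywords.split("\\|")), Integer.parseInt(threshold));
    }

    // Assumption: a log line matches if it contains any keyword (case-sensitive).
    boolean matches(String logLine) {
        for (String keyword : keywords) {
            if (logLine.contains(keyword)) {
                return true;
            }
        }
        return false;
    }
}
```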

What I do now is:

1.   Use a Map[app,logrule] variable to store the log rules, defined
as a static member.

2.   Use a custom StreamingListener that reads the configuration file in
the onBatchStarted event.

3.   When I use the variable, I find that its value is not updated in the
windowed stream, so for now I read the configuration file at the point of use.

4.   The log rule file is on a local path, so I have to copy it to every
worker.
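
One way to avoid reading the file on every use is to keep the rules in a per-JVM holder that reloads them only once they are older than a time-to-live. A minimal sketch in plain Java; `ReloadingHolder`, its loader, and the TTL value are hypothetical names for illustration, not Spark or Drools APIs:

```java
import java.util.function.Supplier;

// Hypothetical per-JVM cache: returns the cached value and calls the loader
// again once the value is older than ttlMillis.
class ReloadingHolder<T> {
    private final Supplier<T> loader;
    private final long ttlMillis;
    private T value;
    private long loadedAt;
    private boolean loaded;

    ReloadingHolder(Supplier<T> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    // Variant with an explicit clock value, which keeps the logic easy to test.
    synchronized T get(long nowMillis) {
        if (!loaded || nowMillis - loadedAt >= ttlMillis) {
            value = loader.get();   // e.g. re-read the rule file or query MySQL
            loadedAt = nowMillis;
            loaded = true;
        }
        return value;
    }

    T get() {
        return get(System.currentTimeMillis());
    }
}
```

In a Spark job the holder would typically live in a static field that is read inside the closure (for example within mapPartitions), so that each executor JVM creates and refreshes its own copy instead of receiving a serialized snapshot from the driver. The loader then has to read from a location that every worker can reach.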

What is the best practice in this case?

Thanks for your help. I'm new to Spark Streaming and do not yet fully
understand how it works.

 

Best Regards,

Evan Yao



Client session timed out, have not heard from server in

2015-12-22 Thread yaoxiaohua
Hi,

I have encountered a similar problem on Spark 1.4.

Master2 runs for some days, then throws a timeout exception and shuts down.

I found this bug:

https://issues.apache.org/jira/browse/SPARK-9629

 


INFO ClientCnxn: Client session timed out, have not heard from server in
40015ms for sessionid 0x351c416297a145a, closing socket connection and
attempting reconnect

 

 

Could you tell me what you did about this?

 

Best Regards,

Evan



RE: Client session timed out, have not heard from server in

2015-12-22 Thread yaoxiaohua
Thanks for your reply.

I found this in spark-env.sh:

SPARK_JAVA_OPTS="$SPARK_JAVA_OPTS -Dspark.akka.askTimeout=300 
-Dspark.ui.retainedStages=1000 -Dspark.eventLog.enabled=true 
-Dspark.eventLog.dir=hdfs://sparkcluster/user/spark_history_logs 
-Dspark.shuffle.spill=false -Dspark.shuffle.manager=hash 
-Dspark.yarn.max.executor.failures=9 -Dspark.worker.timeout=300"

 

The only relevant log entry I found is this:


INFO ClientCnxn: Client session timed out, have not heard from server in 
40015ms for sessionid 0x351c416297a145a, closing socket connection and 
attempting reconnect

It was logged just before the master process on spark2 shut down.

I don’t see any ZooKeeper timeout setting.

 

Best 

 

From: Yash Sharma [mailto:yash...@gmail.com] 
Sent: December 22, 2015 19:55
To: yaoxiaohua
Cc: user@spark.apache.org
Subject: Re: Client session timed out, have not heard from server in

 

Hi Evan, 
SPARK-9629 refers to connection issues with ZooKeeper. Could you check whether 
it's working fine in your setup?

Also please share other error logs you might be getting. 

- Thanks, via mobile,  excuse brevity. 

On Dec 22, 2015 5:00 PM, "yaoxiaohua" <yaoxiao...@outlook.com> wrote:

Hi,

I have encountered a similar problem on Spark 1.4.

Master2 runs for some days, then throws a timeout exception and shuts down.

I found this bug:

https://issues.apache.org/jira/browse/SPARK-9629

 


INFO ClientCnxn: Client session timed out, have not heard from server in 
40015ms for sessionid 0x351c416297a145a, closing socket connection and 
attempting reconnect

 

 

Could you tell me what you did about this?

 

Best Regards,

Evan



spark master process shutdown for timeout

2015-12-16 Thread yaoxiaohua
Hi guys,

I have two nodes used as Spark masters, spark1 and spark2.

Spark 1.4.0

JDK 1.7 (Sun JDK)

 

Recently I found that the master process on spark2 sometimes shuts down. I found
this in the log file:

15/12/17 13:09:58 INFO ClientCnxn: Client session timed out, have not heard
from server in 40020ms for sessionid 0x351a889694144b1, closing socket
connection and attempting reconnect

15/12/17 13:09:58 INFO ConnectionStateManager: State change: SUSPENDED

15/12/17 13:09:58 INFO ZooKeeperLeaderElectionAgent: We have lost leadership

15/12/17 13:09:58 ERROR Master: Leadership has been revoked -- master
shutting down.

15/12/17 13:09:58 INFO Utils: Shutdown hook called

 

It looks like a timeout. I don't know how to change the configuration to avoid
this; please help me.

Thanks.

 

Best Regards,

Evan Yao