Re: [Dev] Issues when using DAS features in an external Spark (non-OSGi) environment

2015-06-29 Thread Sinthuja Ragendran
Hi Gihan,

IMHO the recommended behaviour of the server should be the default
configuration, and our recommendation is to configure purging so that the
server operates smoothly without accumulating a high amount of data in the
data store. Therefore, can we ship a reasonable purging configuration
enabled by default, with a data retention period of 3 months / 90 days,
and run the purging task every week?
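
For example, something along these lines (a sketch only; the element names
follow the analytics-conf.xml block quoted later in this thread, and the
weekly cron expression is my suggestion rather than a tested value):

<analytics-data-purging>
   <purging-enable>true</purging-enable>
   <purge-node>true</purge-node>
   <!-- run weekly: every Sunday at midnight -->
   <cron-expression>0 0 0 ? * SUN</cron-expression>
   <purge-include-table-patterns>
      <table>.*</table>
   </purge-include-table-patterns>
   <data-retention-days>90</data-retention-days>
</analytics-data-purging>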

Thanks,
Sinthuja.

On Mon, Jun 29, 2015 at 2:46 PM, Gihan Anuruddha gi...@wso2.com wrote:

 Hi Sinthuja,

 Yes, by default data purging is disabled. It is a dev-ops decision to
 enable or disable the purging task and come up with suitable inputs such as
 the retention period, the purge-enabled tables, etc.
 Regards,
 Gihan

 On Mon, Jun 29, 2015 at 2:38 PM, Sinthuja Ragendran sinth...@wso2.com
 wrote:

 Hi Nirmal,

 When purging is disabled, any already-registered purging task will be
 deleted; therefore the task service has to be accessed regardless of
 whether purging is enabled or disabled.

 But we can check for the existence of the task service and perform the
 analytics purging related operations if and only if the task service is
 registered; with this we can resolve the issue irrespective of the above
 configuration. And we can log a warning if the purging task is enabled but
 the task service is not registered.
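
 A minimal sketch of that guard (with hypothetical class, method, and
 parameter names; the actual DAL code may differ):

 import org.apache.commons.logging.Log;
 import org.apache.commons.logging.LogFactory;

 public class PurgingTaskGuard {
     private static final Log log = LogFactory.getLog(PurgingTaskGuard.class);

     // 'taskService' is whatever the service holder resolved from OSGi;
     // it stays null inside an external (non-OSGi) Spark executor.
     public static void initPurging(Object taskService, boolean purgingEnabled) {
         if (taskService == null) {
             if (purgingEnabled) {
                 log.warn("Analytics data purging is enabled, but the task "
                         + "service is not registered; skipping purging task setup.");
             }
             return; // nothing to schedule (or delete) outside OSGi
         }
         // ... schedule the purging task here, or delete an already
         // registered task when purging has been disabled ...
     }
 }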

 @Gihan: I think purging needs to be enabled by default for continuous
 operation with an RDBMS datasource, so that too much data doesn't get
 accumulated in the datasource. Is there any reason for it to be disabled?

 Thanks,
 Sinthuja.

 On Mon, Jun 29, 2015 at 2:07 PM, Nirmal Fernando nir...@wso2.com wrote:

 That worked Sinthuja! Thanks. However, is it possible to disable the
 Task Service initialization if the purging is disabled (which is the
 default behaviour)?

 <analytics-data-purging>
    <purging-enable>false</purging-enable>
    <purge-node>true</purge-node>
    <cron-expression>0 0 0 * * ?</cron-expression>
    <purge-include-table-patterns>
       <table>.*</table>
    </purge-include-table-patterns>
    <data-retention-days>365</data-retention-days>
 </analytics-data-purging>

 On Mon, Jun 29, 2015 at 1:57 PM, Sinthuja Ragendran sinth...@wso2.com
 wrote:

 Hi Nirmal,

 Thanks for sharing the necessary details. This happens because the data
 purging configuration has been enabled in analytics-conf.xml, which uses
 the task service internally. Can you please try commenting out the
 analytics purging configuration in
 repository/conf/analytics/analytics-conf.xml and see?
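
 That is, wrap the purging block in an XML comment so it is never parsed
 (the block contents are the same as quoted elsewhere in this thread):

 <!--
 <analytics-data-purging>
    ...
 </analytics-data-purging>
 -->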

 Thanks,
 Sinthuja.

 On Mon, Jun 29, 2015 at 1:44 PM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Sinthuja,

 Thanks for the explanation. I think I should have used DAL instead of
 DAS. Yes, so what I am talking about here is the DAL features. The exact
 error is [1], and the reason for it is the TaskService being null. Can you
 please check?

 [1]

 15/06/28 11:54:51 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.4 KB, free 265.1 MB)

 15/06/28 11:55:02 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)

 java.lang.NullPointerException
     at org.wso2.carbon.analytics.dataservice.AnalyticsDataServiceImpl.<init>(AnalyticsDataServiceImpl.java:149)
     at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.checkAndPopulateCustomAnalyticsDS(AnalyticsServiceHolder.java:79)
     at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.getAnalyticsDataService(AnalyticsServiceHolder.java:67)
     at org.wso2.carbon.analytics.spark.core.internal.ServiceHolder.getAnalyticsDataService(ServiceHolder.java:73)
     at org.wso2.carbon.analytics.spark.core.util.AnalyticsRDD.compute(AnalyticsRDD.java:81)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
     at org.apache.spark.scheduler.Task.run(Task.scala:64)
     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
     at java.lang.Thread.run(Thread.java:745)

 On Mon, Jun 29, 2015 at 12:16 PM, Sinthuja Ragendran 
 sinth...@wso2.com wrote:

 Hi nirmal,

 DAS features such as script scheduling, purging, etc. are used to
 submit the jobs (only Spark queries) to the external Spark cluster; the
 jars for those DAS 

Re: [Dev] Issues when using DAS features in an external Spark (non-OSGi) environment

2015-06-29 Thread Sinthuja Ragendran
Hi Nirmal,

Thanks for sharing the necessary details. This happens because the data
purging configuration has been enabled in analytics-conf.xml, which uses
the task service internally. Can you please try commenting out the
analytics purging configuration in
repository/conf/analytics/analytics-conf.xml and see?

Thanks,
Sinthuja.

On Mon, Jun 29, 2015 at 1:44 PM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Sinthuja,

 Thanks for the explanation. I think I should have used DAL instead of DAS.
 Yes, so what I am talking about here is the DAL features. The exact error
 is [1], and the reason for it is the TaskService being null. Can you please
 check?

 [1]

 15/06/28 11:54:51 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.4 KB, free 265.1 MB)

 15/06/28 11:55:02 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)

 java.lang.NullPointerException
     at org.wso2.carbon.analytics.dataservice.AnalyticsDataServiceImpl.<init>(AnalyticsDataServiceImpl.java:149)
     at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.checkAndPopulateCustomAnalyticsDS(AnalyticsServiceHolder.java:79)
     at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.getAnalyticsDataService(AnalyticsServiceHolder.java:67)
     at org.wso2.carbon.analytics.spark.core.internal.ServiceHolder.getAnalyticsDataService(ServiceHolder.java:73)
     at org.wso2.carbon.analytics.spark.core.util.AnalyticsRDD.compute(AnalyticsRDD.java:81)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
     at org.apache.spark.scheduler.Task.run(Task.scala:64)
     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
     at java.lang.Thread.run(Thread.java:745)

 On Mon, Jun 29, 2015 at 12:16 PM, Sinthuja Ragendran sinth...@wso2.com
 wrote:

 Hi nirmal,

 DAS features such as script scheduling, purging, etc. are used to submit
 the jobs (only Spark queries) to the external Spark cluster; the jars for
 those DAS features don't need to exist within the external Spark cluster
 instance. For example, consider the scheduled Spark script execution
 scenario, which uses the Task OSGi service: the task triggering occurs
 within the DAS node (OSGi environment), and when Spark is configured
 externally the job is handed over to the external cluster and the results
 are returned to the DAS node. Therefore I don't think any DAS feature jars
 other than the DAL feature jars will be required inside the external Spark
 cluster.

 Can you please explain more about your use case, and how you have
 configured the setup with the DAS features?

 Thanks,
 Sinthuja.


 On Sunday, June 28, 2015, Nirmal Fernando nir...@wso2.com wrote:

 Hi DAS team,

 It appears that we have to think about and implement DAS features so that
 they will run even in a non-OSGi environment, like the external Spark
 scenario. We have some DAS features which depend on the Task Service etc.,
 and they fail when we use them from within a Spark job that runs on an
 external Spark cluster.

 How can we solve this?

 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





-- 
*Sinthuja Rajendran*
Associate Technical Lead
WSO2, Inc.: http://wso2.com

Blog: http://sinthu-rajan.blogspot.com/
Mobile: +94774273955
___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


Re: [Dev] Issues when using DAS features in an external Spark (non-OSGi) environment

2015-06-29 Thread Sinthuja Ragendran
Hi nirmal,

DAS features such as script scheduling, purging, etc. are used to submit the
jobs (only Spark queries) to the external Spark cluster; the jars for those
DAS features don't need to exist within the external Spark cluster instance.
For example, consider the scheduled Spark script execution scenario, which
uses the Task OSGi service: the task triggering occurs within the DAS node
(OSGi environment), and when Spark is configured externally the job is
handed over to the external cluster and the results are returned to the DAS
node. Therefore I don't think any DAS feature jars other than the DAL
feature jars will be required inside the external Spark cluster.

Can you please explain more about your use case, and how you have
configured the setup with the DAS features?

Thanks,
Sinthuja.


On Sunday, June 28, 2015, Nirmal Fernando nir...@wso2.com wrote:

 Hi DAS team,

 It appears that we have to think about and implement DAS features so that
 they will run even in a non-OSGi environment, like the external Spark
 scenario. We have some DAS features which depend on the Task Service etc.,
 and they fail when we use them from within a Spark job that runs on an
 external Spark cluster.

 How can we solve this?

 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/



___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


Re: [Dev] Issues when using DAS features in an external Spark (non-OSGi) environment

2015-06-29 Thread Sinthuja Ragendran
Hi Nirmal,

When purging is disabled, any already-registered purging task will be
deleted; therefore the task service has to be accessed regardless of
whether purging is enabled or disabled.

But we can check for the existence of the task service and perform the
analytics purging related operations if and only if the task service is
registered; with this we can resolve the issue irrespective of the above
configuration. And we can log a warning if the purging task is enabled but
the task service is not registered.

@Gihan: I think purging needs to be enabled by default for continuous
operation with an RDBMS datasource, so that too much data doesn't get
accumulated in the datasource. Is there any reason for it to be disabled?

Thanks,
Sinthuja.

On Mon, Jun 29, 2015 at 2:07 PM, Nirmal Fernando nir...@wso2.com wrote:

 That worked Sinthuja! Thanks. However, is it possible to disable the Task
 Service initialization if the purging is disabled (which is the default
 behaviour)?

 <analytics-data-purging>
    <purging-enable>false</purging-enable>
    <purge-node>true</purge-node>
    <cron-expression>0 0 0 * * ?</cron-expression>
    <purge-include-table-patterns>
       <table>.*</table>
    </purge-include-table-patterns>
    <data-retention-days>365</data-retention-days>
 </analytics-data-purging>

 On Mon, Jun 29, 2015 at 1:57 PM, Sinthuja Ragendran sinth...@wso2.com
 wrote:

 Hi Nirmal,

 Thanks for sharing the necessary details. This happens because the data
 purging configuration has been enabled in analytics-conf.xml, which uses
 the task service internally. Can you please try commenting out the
 analytics purging configuration in
 repository/conf/analytics/analytics-conf.xml and see?

 Thanks,
 Sinthuja.

 On Mon, Jun 29, 2015 at 1:44 PM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Sinthuja,

 Thanks for the explanation. I think I should have used DAL instead of
 DAS. Yes, so what I am talking about here is the DAL features. The exact
 error is [1], and the reason for it is the TaskService being null. Can you
 please check?

 [1]

 15/06/28 11:54:51 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.4 KB, free 265.1 MB)

 15/06/28 11:55:02 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)

 java.lang.NullPointerException
     at org.wso2.carbon.analytics.dataservice.AnalyticsDataServiceImpl.<init>(AnalyticsDataServiceImpl.java:149)
     at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.checkAndPopulateCustomAnalyticsDS(AnalyticsServiceHolder.java:79)
     at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.getAnalyticsDataService(AnalyticsServiceHolder.java:67)
     at org.wso2.carbon.analytics.spark.core.internal.ServiceHolder.getAnalyticsDataService(ServiceHolder.java:73)
     at org.wso2.carbon.analytics.spark.core.util.AnalyticsRDD.compute(AnalyticsRDD.java:81)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
     at org.apache.spark.scheduler.Task.run(Task.scala:64)
     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
     at java.lang.Thread.run(Thread.java:745)

 On Mon, Jun 29, 2015 at 12:16 PM, Sinthuja Ragendran sinth...@wso2.com
 wrote:

 Hi nirmal,

 DAS features such as script scheduling, purging, etc. are used to submit
 the jobs (only Spark queries) to the external Spark cluster; the jars for
 those DAS features don't need to exist within the external Spark cluster
 instance. For example, consider the scheduled Spark script execution
 scenario, which uses the Task OSGi service: the task triggering occurs
 within the DAS node (OSGi environment), and when Spark is configured
 externally the job is handed over to the external cluster and the results
 are returned to the DAS node. Therefore I don't think any DAS feature jars
 other than the DAL feature jars will be required inside the external Spark
 cluster.

 Can you please explain more about your use case, and how you have
 configured the setup with the DAS features?

 Thanks,
 Sinthuja.


 On Sunday, June 28, 2015, Nirmal Fernando nir...@wso2.com wrote:

 Hi DAS team,

 It appears that we have to think and 

Re: [Dev] Issues when using DAS features in an external Spark (non-OSGi) environment

2015-06-29 Thread Nirmal Fernando
Hi Sinthuja,

Thanks for the explanation. I think I should have used DAL instead of DAS.
Yes, so what I am talking about here is the DAL features. The exact error
is [1], and the reason for it is the TaskService being null. Can you please
check?

[1]

15/06/28 11:54:51 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.4 KB, free 265.1 MB)

15/06/28 11:55:02 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)

java.lang.NullPointerException
    at org.wso2.carbon.analytics.dataservice.AnalyticsDataServiceImpl.<init>(AnalyticsDataServiceImpl.java:149)
    at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.checkAndPopulateCustomAnalyticsDS(AnalyticsServiceHolder.java:79)
    at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.getAnalyticsDataService(AnalyticsServiceHolder.java:67)
    at org.wso2.carbon.analytics.spark.core.internal.ServiceHolder.getAnalyticsDataService(ServiceHolder.java:73)
    at org.wso2.carbon.analytics.spark.core.util.AnalyticsRDD.compute(AnalyticsRDD.java:81)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:64)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

On Mon, Jun 29, 2015 at 12:16 PM, Sinthuja Ragendran sinth...@wso2.com
wrote:

 Hi nirmal,

 DAS features such as script scheduling, purging, etc. are used to submit
 the jobs (only Spark queries) to the external Spark cluster; the jars for
 those DAS features don't need to exist within the external Spark cluster
 instance. For example, consider the scheduled Spark script execution
 scenario, which uses the Task OSGi service: the task triggering occurs
 within the DAS node (OSGi environment), and when Spark is configured
 externally the job is handed over to the external cluster and the results
 are returned to the DAS node. Therefore I don't think any DAS feature jars
 other than the DAL feature jars will be required inside the external Spark
 cluster.

 Can you please explain more about your use case, and how you have
 configured the setup with the DAS features?

 Thanks,
 Sinthuja.


 On Sunday, June 28, 2015, Nirmal Fernando nir...@wso2.com wrote:

 Hi DAS team,

 It appears that we have to think about and implement DAS features so that
 they will run even in a non-OSGi environment, like the external Spark
 scenario. We have some DAS features which depend on the Task Service etc.,
 and they fail when we use them from within a Spark job that runs on an
 external Spark cluster.

 How can we solve this?

 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





-- 

Thanks & regards,
Nirmal

Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


Re: [Dev] Issues when using DAS features in an external Spark (non-OSGi) environment

2015-06-29 Thread Nirmal Fernando
That worked Sinthuja! Thanks. However, is it possible to disable the Task
Service initialization if the purging is disabled (which is the default
behaviour)?

<analytics-data-purging>
   <purging-enable>false</purging-enable>
   <purge-node>true</purge-node>
   <cron-expression>0 0 0 * * ?</cron-expression>
   <purge-include-table-patterns>
      <table>.*</table>
   </purge-include-table-patterns>
   <data-retention-days>365</data-retention-days>
</analytics-data-purging>

On Mon, Jun 29, 2015 at 1:57 PM, Sinthuja Ragendran sinth...@wso2.com
wrote:

 Hi Nirmal,

 Thanks for sharing the necessary details. This happens because the data
 purging configuration has been enabled in analytics-conf.xml, which uses
 the task service internally. Can you please try commenting out the
 analytics purging configuration in
 repository/conf/analytics/analytics-conf.xml and see?

 Thanks,
 Sinthuja.

 On Mon, Jun 29, 2015 at 1:44 PM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Sinthuja,

 Thanks for the explanation. I think I should have used DAL instead of
 DAS. Yes, so what I am talking about here is the DAL features. The exact
 error is [1], and the reason for it is the TaskService being null. Can you
 please check?

 [1]

 15/06/28 11:54:51 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.4 KB, free 265.1 MB)

 15/06/28 11:55:02 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)

 java.lang.NullPointerException
     at org.wso2.carbon.analytics.dataservice.AnalyticsDataServiceImpl.<init>(AnalyticsDataServiceImpl.java:149)
     at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.checkAndPopulateCustomAnalyticsDS(AnalyticsServiceHolder.java:79)
     at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.getAnalyticsDataService(AnalyticsServiceHolder.java:67)
     at org.wso2.carbon.analytics.spark.core.internal.ServiceHolder.getAnalyticsDataService(ServiceHolder.java:73)
     at org.wso2.carbon.analytics.spark.core.util.AnalyticsRDD.compute(AnalyticsRDD.java:81)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
     at org.apache.spark.scheduler.Task.run(Task.scala:64)
     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
     at java.lang.Thread.run(Thread.java:745)

 On Mon, Jun 29, 2015 at 12:16 PM, Sinthuja Ragendran sinth...@wso2.com
 wrote:

 Hi nirmal,

 DAS features such as script scheduling, purging, etc. are used to submit
 the jobs (only Spark queries) to the external Spark cluster; the jars for
 those DAS features don't need to exist within the external Spark cluster
 instance. For example, consider the scheduled Spark script execution
 scenario, which uses the Task OSGi service: the task triggering occurs
 within the DAS node (OSGi environment), and when Spark is configured
 externally the job is handed over to the external cluster and the results
 are returned to the DAS node. Therefore I don't think any DAS feature jars
 other than the DAL feature jars will be required inside the external Spark
 cluster.

 Can you please explain more about your use case, and how you have
 configured the setup with the DAS features?

 Thanks,
 Sinthuja.


 On Sunday, June 28, 2015, Nirmal Fernando nir...@wso2.com wrote:

 Hi DAS team,

 It appears that we have to think about and implement DAS features so that
 they will run even in a non-OSGi environment, like the external Spark
 scenario. We have some DAS features which depend on the Task Service
 etc., and they fail when we use them from within a Spark job which
 runs on an external Spark cluster.

 How can we solve this?

 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





 --
 *Sinthuja Rajendran*
 Associate Technical Lead
 WSO2, Inc.: http://wso2.com

 Blog: http://sinthu-rajan.blogspot.com/
 Mobile: +94774273955





-- 

Thanks & regards,
Nirmal

Associate Technical Lead - 

Re: [Dev] Issues when using DAS features in an external Spark (non-OSGi) environment

2015-06-29 Thread Gihan Anuruddha
Hi Sinthuja,

Yes, by default data purging is disabled. It is a dev-ops decision to
enable or disable the purging task and come up with suitable inputs such as
the retention period, the purge-enabled tables, etc.
Regards,
Gihan

On Mon, Jun 29, 2015 at 2:38 PM, Sinthuja Ragendran sinth...@wso2.com
wrote:

 Hi Nirmal,

 When purging is disabled, any already-registered purging task will be
 deleted; therefore the task service has to be accessed regardless of
 whether purging is enabled or disabled.

 But we can check for the existence of the task service and perform the
 analytics purging related operations if and only if the task service is
 registered; with this we can resolve the issue irrespective of the above
 configuration. And we can log a warning if the purging task is enabled but
 the task service is not registered.

 @Gihan: I think purging needs to be enabled by default for continuous
 operation with an RDBMS datasource, so that too much data doesn't get
 accumulated in the datasource. Is there any reason for it to be disabled?

 Thanks,
 Sinthuja.

 On Mon, Jun 29, 2015 at 2:07 PM, Nirmal Fernando nir...@wso2.com wrote:

 That worked Sinthuja! Thanks. However, is it possible to disable the Task
 Service initialization if the purging is disabled (which is the default
 behaviour)?

 <analytics-data-purging>
    <purging-enable>false</purging-enable>
    <purge-node>true</purge-node>
    <cron-expression>0 0 0 * * ?</cron-expression>
    <purge-include-table-patterns>
       <table>.*</table>
    </purge-include-table-patterns>
    <data-retention-days>365</data-retention-days>
 </analytics-data-purging>

 On Mon, Jun 29, 2015 at 1:57 PM, Sinthuja Ragendran sinth...@wso2.com
 wrote:

 Hi Nirmal,

 Thanks for sharing the necessary details. This happens because the data
 purging configuration has been enabled in analytics-conf.xml, which uses
 the task service internally. Can you please try commenting out the
 analytics purging configuration in
 repository/conf/analytics/analytics-conf.xml and see?

 Thanks,
 Sinthuja.

 On Mon, Jun 29, 2015 at 1:44 PM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Sinthuja,

 Thanks for the explanation. I think I should have used DAL instead of
 DAS. Yes, so what I am talking about here is the DAL features. The exact
 error is [1], and the reason for it is the TaskService being null. Can you
 please check?

 [1]

 15/06/28 11:54:51 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.4 KB, free 265.1 MB)

 15/06/28 11:55:02 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)

 java.lang.NullPointerException
     at org.wso2.carbon.analytics.dataservice.AnalyticsDataServiceImpl.<init>(AnalyticsDataServiceImpl.java:149)
     at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.checkAndPopulateCustomAnalyticsDS(AnalyticsServiceHolder.java:79)
     at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.getAnalyticsDataService(AnalyticsServiceHolder.java:67)
     at org.wso2.carbon.analytics.spark.core.internal.ServiceHolder.getAnalyticsDataService(ServiceHolder.java:73)
     at org.wso2.carbon.analytics.spark.core.util.AnalyticsRDD.compute(AnalyticsRDD.java:81)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
     at org.apache.spark.scheduler.Task.run(Task.scala:64)
     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
     at java.lang.Thread.run(Thread.java:745)

 On Mon, Jun 29, 2015 at 12:16 PM, Sinthuja Ragendran sinth...@wso2.com
  wrote:

 Hi nirmal,

 DAS features such as script scheduling, purging, etc. are used to
 submit the jobs (only Spark queries) to the external Spark cluster; the
 jars for those DAS features don't need to exist within the external Spark
 cluster instance. For example, consider the scheduled Spark script
 execution scenario, which uses the Task OSGi service: the task triggering
 occurs within the DAS node (OSGi environment), and when Spark is
 configured externally the job is handed over to the external cluster and
 the results are returned to the DAS node. Therefore I don't
 think any DAS feature jars other than the DAL feature jars will be