Hi Gihan,

IMHO the recommended behaviour of the server should be the default
configuration, and our recommendation is to configure purging so that the
server operates smoothly without accumulating a large amount of data in
the data store. Therefore, can we ship a reasonable purging configuration
enabled by default, with a data retention period of 3 months / 90 days and
the purging task running every week?
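
Something along these lines, reusing the element names from the
analytics-conf.xml snippet quoted below; the weekly cron expression is an
illustrative Quartz value, not a tested default:

```xml
<!-- Sketch of the suggested defaults. Element names follow the existing
     analytics-conf.xml snippet in this thread; the cron expression is an
     illustrative Quartz value for "every Monday at midnight". -->
<analytics-data-purging>
   <purging-enable>true</purging-enable>
   <purge-node>true</purge-node>
   <cron-expression>0 0 0 ? * MON</cron-expression>
   <purge-include-table-patterns>
      <table>.*</table>
   </purge-include-table-patterns>
   <data-retention-days>90</data-retention-days>
</analytics-data-purging>
```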

Thanks,
Sinthuja.

On Mon, Jun 29, 2015 at 2:46 PM, Gihan Anuruddha <[email protected]> wrote:

> Hi Sinthuja,
>
> Yes, by default data purging is disabled. It is a dev ops decision to
> enable or disable the purging task and to come up with suitable inputs
> such as the retention period, the tables to purge, etc.
> Regards,
> Gihan
>
> On Mon, Jun 29, 2015 at 2:38 PM, Sinthuja Ragendran <[email protected]>
> wrote:
>
>> Hi Nirmal,
>>
>> When purging is disabled, any already registered purging task will be
>> deleted; therefore access to the task service is required in any case,
>> whether purging is enabled or disabled.
>>
>> However, we can check for the existence of the task service and perform
>> the analytics purging operations if and only if the task service is
>> registered; with this we can resolve the issue irrespective of the above
>> configuration. We can also log a warning if the purging task is enabled
>> but the task service is not registered.
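
The guard described above could look roughly like the following sketch;
`TaskService` and the method names here are minimal stand-ins for
illustration, not the actual Carbon DAL APIs:

```java
// Minimal sketch of the proposed guard: touch the purging task only when a
// task service is actually registered, and warn when purging is enabled but
// the service is missing. TaskService and these method names are stand-ins,
// not the real Carbon APIs.
public class PurgingGuard {

    interface TaskService {
        void scheduleTask(String name);
        void deleteTask(String name);
    }

    static String checkAndApply(TaskService taskService, boolean purgingEnabled) {
        if (taskService == null) {
            if (purgingEnabled) {
                // Purging was requested but cannot be scheduled: surface a warning.
                return "WARN: purging enabled but task service is not registered";
            }
            // No service and nothing to do: silently skip.
            return "skipped";
        }
        if (purgingEnabled) {
            taskService.scheduleTask("analytics-data-purging");
            return "scheduled";
        }
        // Purging disabled: remove any previously registered task.
        taskService.deleteTask("analytics-data-purging");
        return "deleted";
    }

    public static void main(String[] args) {
        TaskService stub = new TaskService() {
            public void scheduleTask(String name) { }
            public void deleteTask(String name) { }
        };
        System.out.println(checkAndApply(null, true));   // warning path
        System.out.println(checkAndApply(stub, true));   // schedules the task
        System.out.println(checkAndApply(stub, false));  // deletes the task
    }
}
```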
>>
>> @Gihan: I think purging needs to be enabled by default for continuous
>> operation with an RDBMS datasource, so that too much data does not
>> accumulate in the datasource. Is there any reason for it to be disabled?
>>
>> Thanks,
>> Sinthuja.
>>
>> On Mon, Jun 29, 2015 at 2:07 PM, Nirmal Fernando <[email protected]> wrote:
>>
>>> That worked Sinthuja! Thanks. However, is it possible to disable the
>>> Task Service initialization if the purging is disabled (which is the
>>> default behaviour)?
>>>
>>> <analytics-data-purging>
>>>    <purging-enable>false</purging-enable>
>>>    <purge-node>true</purge-node>
>>>    <cron-expression>0 0 0 * * ?</cron-expression>
>>>    <purge-include-table-patterns>
>>>       <table>.*</table>
>>>    </purge-include-table-patterns>
>>>    <data-retention-days>365</data-retention-days>
>>> </analytics-data-purging>
>>>
>>> On Mon, Jun 29, 2015 at 1:57 PM, Sinthuja Ragendran <[email protected]>
>>> wrote:
>>>
>>>> Hi Nirmal,
>>>>
>>>> Thanks for sharing the necessary details. This happens because the data
>>>> purging configuration is enabled in analytics-conf.xml, and purging uses
>>>> the task service internally. Can you please try commenting out the
>>>> analytics purging configuration in
>>>> repository/conf/analytics/analytics-conf.xml and see?
>>>>
>>>> Thanks,
>>>> Sinthuja.
>>>>
>>>> On Mon, Jun 29, 2015 at 1:44 PM, Nirmal Fernando <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Sinthuja,
>>>>>
>>>>> Thanks for the explanation. I think I should have said DAL instead of
>>>>> DAS; what I am talking about here are the DAL features. The exact error
>>>>> is [1], and the reason for it is that the TaskService is null. Can you
>>>>> please check?
>>>>>
>>>>> [1]
>>>>>
>>>>> 15/06/28 11:54:51 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.4 KB, free 265.1 MB)
>>>>> 15/06/28 11:55:02 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
>>>>> java.lang.NullPointerException
>>>>>         at org.wso2.carbon.analytics.dataservice.AnalyticsDataServiceImpl.<init>(AnalyticsDataServiceImpl.java:149)
>>>>>         at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.checkAndPopulateCustomAnalyticsDS(AnalyticsServiceHolder.java:79)
>>>>>         at org.wso2.carbon.analytics.dataservice.AnalyticsServiceHolder.getAnalyticsDataService(AnalyticsServiceHolder.java:67)
>>>>>         at org.wso2.carbon.analytics.spark.core.internal.ServiceHolder.getAnalyticsDataService(ServiceHolder.java:73)
>>>>>         at org.wso2.carbon.analytics.spark.core.util.AnalyticsRDD.compute(AnalyticsRDD.java:81)
>>>>>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>>>>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>>>>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>>>>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>>>>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>>>>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>>>>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>>>>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>>>>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>>>>>         at org.apache.spark.scheduler.Task.run(Task.scala:64)
>>>>>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>         at java.lang.Thread.run(Thread.java:745)
>>>>>
>>>>> On Mon, Jun 29, 2015 at 12:16 PM, Sinthuja Ragendran <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi Nirmal,
>>>>>>
>>>>>> DAS features such as script scheduling, purging, etc. only submit the
>>>>>> jobs (i.e. the Spark queries) to the external Spark cluster; the jars
>>>>>> for those DAS features do not need to exist within the external Spark
>>>>>> cluster instance. For example, consider the scheduled Spark script
>>>>>> execution scenario, which uses the Task OSGi service: the task
>>>>>> triggering occurs within the DAS node (the OSGi environment), and when
>>>>>> Spark is configured externally the job is handed over to the external
>>>>>> cluster, with the results returned to the DAS node. Therefore I don't
>>>>>> think any DAS feature jars other than the DAL feature jars need to be
>>>>>> inside the external Spark cluster.
>>>>>>
>>>>>> Can you please explain your use case in more detail? How have you
>>>>>> configured the setup with the DAS features?
>>>>>>
>>>>>> Thanks,
>>>>>> Sinthuja.
>>>>>>
>>>>>>
>>>>>> On Sunday, June 28, 2015, Nirmal Fernando <[email protected]> wrote:
>>>>>>
>>>>>>> Hi DAS team,
>>>>>>>
>>>>>>> It appears that we need to design and implement DAS features so that
>>>>>>> they run even in a non-OSGi environment, such as an external Spark
>>>>>>> cluster. We have some DAS features that depend on the Task Service
>>>>>>> etc., and they fail when we use them from within a Spark job that runs
>>>>>>> on an external Spark cluster.
>>>>>>>
>>>>>>> How can we solve this?
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Thanks & regards,
>>>>>>> Nirmal
>>>>>>>
>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>> Mobile: +94715779733
>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Thanks & regards,
>>>>> Nirmal
>>>>>
>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>> Mobile: +94715779733
>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> *Sinthuja Rajendran*
>>>> Associate Technical Lead
>>>> WSO2, Inc.:http://wso2.com
>>>>
>>>> Blog: http://sinthu-rajan.blogspot.com/
>>>> Mobile: +94774273955
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Thanks & regards,
>>> Nirmal
>>>
>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>> Mobile: +94715779733
>>> Blog: http://nirmalfdo.blogspot.com/
>>>
>>>
>>>
>>
>>
>> --
>> *Sinthuja Rajendran*
>> Associate Technical Lead
>> WSO2, Inc.:http://wso2.com
>>
>> Blog: http://sinthu-rajan.blogspot.com/
>> Mobile: +94774273955
>>
>>
>>
>
>
> --
> W.G. Gihan Anuruddha
> Senior Software Engineer | WSO2, Inc.
> M: +94772272595
>



-- 
*Sinthuja Rajendran*
Associate Technical Lead
WSO2, Inc.:http://wso2.com

Blog: http://sinthu-rajan.blogspot.com/
Mobile: +94774273955
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
