Hi Abhimanyu,

Try setting a TTL for the rows in your HBase table. It can be set in the HBase shell:
        alter 'pio_event:events_?', NAME => 'e', TTL => <seconds to live>
and then run a major compaction in the shell:
        major_compact 'pio_event:events_?'
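You can confirm the TTL took effect with:
        describe 'pio_event:events_?'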

You can also configure automatic major compaction: it will delete all rows
older than the TTL.
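The interval between automatic major compactions is controlled by
hbase.hregion.majorcompaction in hbase-site.xml (in milliseconds). A minimal
sketch, assuming a 7-day interval:
        <property>
          <name>hbase.hregion.majorcompaction</name>
          <value>604800000</value> <!-- 7 days in milliseconds -->
        </property>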

> On 23 Nov 2017, at 12:19, Abhimanyu Nagrath <[email protected]> wrote:
> 
> Hi,
> 
> I am stuck at this point. How can I identify the problem?
> 
> 
> Regards,
> Abhimanyu
> 
> On Mon, Nov 20, 2017 at 11:08 AM, Abhimanyu Nagrath
> <[email protected]> wrote:
> Hi, I am new to PredictionIO v0.12.0 (Elasticsearch 5.2.1, HBase 1.2.6, Spark
> 2.6.0), running on hardware with 244 GB RAM and 32 cores. I have uploaded
> about 1 million events (each containing 30k features). While uploading I
> could see the HBase disk usage increasing, and after all the events were
> uploaded the HBase disk size was 567 GB. To verify, I ran the following
> commands:
> 
>  - pio-shell --with-spark --conf spark.network.timeout=10000000 
> --driver-memory 30G --executor-memory 21G --num-executors 7 --executor-cores 
> 3 --conf spark.driver.maxResultSize=4g --conf 
> spark.executor.heartbeatInterval=10000000
>  - import org.apache.predictionio.data.store.PEventStore
>  - val eventsRDD = PEventStore.find(appName="test")(sc)
>  - val c = eventsRDD.count() 
> It shows the event count as 18944.
> 
> After that, from the script through which I uploaded the events, I randomly
> queried with their event IDs and I was getting those events back.
> 
> I don't know how to make sure that all the events I uploaded are actually in
> the app. Any help is appreciated.
> 
> 
> Regards,
> Abhimanyu
> 
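P.S. On the count mismatch above: a quick way to see what actually landed in
the event store is to break the count down by event name and by day from
pio-shell. A minimal sketch, assuming the same appName ("test") as in your
commands:

        import org.apache.predictionio.data.store.PEventStore
        val eventsRDD = PEventStore.find(appName="test")(sc)
        // events per event name, to see which kinds of events are present
        eventsRDD.map(_.event).countByValue().foreach(println)
        // events per day, to spot whole time ranges that never made it in
        eventsRDD.map(_.eventTime.toLocalDate.toString).countByValue().foreach(println)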
