But when I run the command "count 'pio_event:events'" in the HBase shell, it shows me all the rows: 1.5 million.
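(For a table this size a plain count is slow; the HBase shell's count command accepts CACHE and INTERVAL options to batch the scan and limit progress output. A sketch — the numbers are only illustrative:

    count 'pio_event:events', INTERVAL => 100000, CACHE => 10000
)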
On Thu, Nov 23, 2017 at 2:53 PM, Александр Лактионов <[email protected]> wrote:
> Hi Abhimanyu,
>
> try setting a TTL for the rows in your HBase table
> it can be set in the hbase shell:
> alter 'pio_event:events_?', NAME => 'e', TTL => <seconds to live>
> and then run the following in the shell:
> major_compact 'pio_event:events_?'
>
> You can configure automatic major compaction: it will delete all rows that
> are older than the TTL
>
> On Nov 23, 2017, at 12:19, Abhimanyu Nagrath <[email protected]>
> wrote:
>
> Hi,
>
> I am stuck at this point. How do I identify the problem?
>
>
> Regards,
> Abhimanyu
>
> On Mon, Nov 20, 2017 at 11:08 AM, Abhimanyu Nagrath <
> [email protected]> wrote:
>
>> Hi, I am new to PredictionIO v0.12.0 (Elasticsearch 5.2.1, HBase 1.2.6,
>> Spark 2.6.0), running on hardware with 244 GB RAM and 32 cores. I have
>> uploaded about 1 million events (each containing 30k features). While
>> uploading I could see HBase disk usage increasing, and after all the
>> events were uploaded the HBase disk usage was 567 GB. To verify, I ran
>> the following commands:
>>
>> - pio-shell --with-spark --conf spark.network.timeout=10000000
>> --driver-memory 30G --executor-memory 21G --num-executors 7
>> --executor-cores 3 --conf spark.driver.maxResultSize=4g --conf
>> spark.executor.heartbeatInterval=10000000
>> - import org.apache.predictionio.data.store.PEventStore
>> - val eventsRDD = PEventStore.find(appName="test")(sc)
>> - val c = eventsRDD.count()
>>
>> It shows the event count as 18944.
>>
>> After that, from the script through which I uploaded the events, I
>> randomly queried some of the event IDs, and each query returned the
>> event.
>>
>> I don't know how to make sure that all the events I uploaded are
>> actually in the app. Any help is appreciated.
>>
>>
>> Regards,
>> Abhimanyu
>>
>
>
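(One way to see what the PEventStore count actually covers is to break the total down by event name and entity type. A minimal sketch to run inside pio-shell, where `sc` is already defined; it reuses the appName "test" from the thread, and the breakdown is my addition, not something from the original posts:

    import org.apache.predictionio.data.store.PEventStore

    // Count all events stored for the app "test".
    val eventsRDD = PEventStore.find(appName = "test")(sc)
    println(s"total events: ${eventsRDD.count()}")

    // Break the total down by (event name, entity type) to see whether
    // the reported figure is the whole store or just one slice of it.
    eventsRDD.map(e => (e.event, e.entityType)).countByValue().foreach(println)
)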
