@Abdullah, any luck on this issue? Thanks and happy new year!

> On Nov 30, 2015, at 5:24 PM, abdullah alamoudi <[email protected]> wrote:
>
> I know exactly what is going on here. The problem you pointed out is
> caused by the duplicate keys. If I remember correctly, the main issue is
> that the locks placed on the primary keys are not released.
>
> I will start fixing this issue tonight.
>
> Cheers,
> Abdullah.
>
> Amoudi, Abdullah.
>
> On Mon, Nov 30, 2015 at 4:52 PM, Jianfeng Jia <[email protected]>
> wrote:
>
>> Dear devs,
>>
>> I hit a weird issue that is reproducible, but only if the data has
>> duplicates and is also large enough. Let me explain it step by step:
>>
>> 1. The dataset is very simple and has only two fields.
>> DDL AQL:
>> —————————————
>> drop dataverse test if exists;
>> create dataverse test;
>> use dataverse test;
>>
>> create type t_test as closed {
>>   fa: int64,
>>   fb: int64
>> }
>>
>> create dataset ds_test(t_test) primary key fa;
>>
>> create feed fd_test using socket_adapter
>> (
>>   ("sockets"="nc1:10001"),
>>   ("address-type"="nc"),
>>   ("type-name"="t_test"),
>>   ("format"="adm"),
>>   ("duration"="1200")
>> );
>>
>> set wait-for-completion-feed "false";
>> connect feed fd_test to dataset ds_test using policy AdvancedFT_Discard;
>> ——————————————————————————————
>>
>> The AdvancedFT_Discard policy ignores exceptions from the insertion and
>> keeps ingesting.
>>
>> 2. Ingest the data with a very simple socket adapter client that reads
>> the records one by one from an ADM file. The source is here:
>> https://github.com/JavierJia/twitter-tracker/blob/master/src/main/java/edu/uci/ics/twitter/asterix/feed/FileFeedSocketAdapterClient.java
>> The data and the app package are provided here:
>> https://drive.google.com/folderview?id=0B423M7wGZj9dYVQ1TkpBNzcwSlE&usp=sharing
>> To feed the data you can run:
>>
>> ./bin/feedFile -u 172.17.0.2 -p 10001 -c 5000000 ~/data/twitter/test.adm
>>
>> -u for the server URL
>> -p for the server port
>> -c for the number of lines you want to ingest
>>
>> 3. After ingestion, all requests that touch ds_test hang. There is no
>> exception and no response for hours. However, the system still answers
>> queries on other datasets, such as Metadata.
>>
>> The data contains some duplicate records, which should trigger the
>> insert exception. If I lower the count from 5000000 to, say, 3000000,
>> there is no problem, although that subset contains duplicates as well.
>>
>> Do any feed experts have a hint on which part could be wrong? The CC
>> and NC logs are attached. Thank you!
>>
>> Best,
>>
>> Jianfeng Jia
>> PhD Candidate of Computer Science
>> University of California, Irvine
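
For anyone trying to reproduce without the packaged app: the client in
step 2 above is essentially a plain TCP writer. Below is a minimal sketch
of that idea, assuming one ADM record per line in the input file. It is my
simplification, not the actual FileFeedSocketAdapterClient; the class name
and argument handling here are made up for illustration.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical simplified feed client; the real one is
// FileFeedSocketAdapterClient in the linked repo.
public class SimpleFeedClient {
    public static void main(String[] args) throws Exception {
        String host = args[0];                // e.g. 172.17.0.2
        int port = Integer.parseInt(args[1]); // e.g. 10001
        int count = Integer.parseInt(args[2]);// e.g. 5000000
        String admFile = args[3];             // e.g. test.adm

        try (Socket socket = new Socket(host, port);
             BufferedReader reader = new BufferedReader(new FileReader(admFile))) {
            OutputStream out = socket.getOutputStream();
            String record; // one ADM record per line, e.g. { "fa": 1, "fb": 2 }
            int sent = 0;
            while (sent < count && (record = reader.readLine()) != null) {
                out.write((record + "\n").getBytes(StandardCharsets.UTF_8));
                sent++;
            }
            out.flush();
        }
    }
}

Run with the same host/port the feed listens on, e.g.:
java SimpleFeedClient 172.17.0.2 10001 5000000 test.adm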
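For context on Abdullah's diagnosis above: if the insert path throws on a
duplicate key before the primary-key lock is released, every later
operation on that key blocks, which would match the symptom of ds_test
hanging while other datasets stay responsive. Here is a toy illustration
of that pattern; this is purely illustrative Java, not AsterixDB's actual
lock manager, and all names in it are hypothetical.

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only. Shows how an insert path that throws on a duplicate
// key can leak a primary-key lock if the release is not in a finally
// block, leaving later readers of the same key blocked indefinitely.
public class LockLeakSketch {
    private final Lock primaryKeyLock = new ReentrantLock();

    // Buggy pattern: the duplicate-key exception skips unlock().
    void insertBuggy(long key, boolean isDuplicate) {
        primaryKeyLock.lock();
        if (isDuplicate) {
            throw new IllegalStateException("duplicate key: " + key);
        }
        // ... write the record ...
        primaryKeyLock.unlock(); // never reached on the exception path
    }

    // Fixed pattern: unlock() always runs, even when the insert throws.
    void insertFixed(long key, boolean isDuplicate) {
        primaryKeyLock.lock();
        try {
            if (isDuplicate) {
                throw new IllegalStateException("duplicate key: " + key);
            }
            // ... write the record ...
        } finally {
            primaryKeyLock.unlock();
        }
    }
}

The usual fix is exactly the second shape: release the lock in a finally
block so the exception path cannot leak it.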
Best,

Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine
