Therein lies the value of eating our own dogfood early on!
(So that non-team-member dogs don't die of either food poisoning or starvation or choking. :-))

On 12/1/15 10:03 AM, Chen Li wrote:
Cool.  The prototype Jianfeng is building revealed quite a few issues in
the system :-)

On Mon, Nov 30, 2015 at 5:24 PM, abdullah alamoudi <[email protected]>
wrote:

I know exactly what is going on here. The problem you pointed out is
caused by the duplicate keys. If I remember correctly, the main issue is
that the locks that are placed on the primary keys are not released.
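
To illustrate what I mean (only a rough sketch with made-up names, not the
actual AsterixDB locking code), the insert path needs to release the
primary-key lock even when the duplicate-key exception is thrown, roughly:

// Sketch only: LockManager and Index here are made-up interfaces for the
// example, not the real AsterixDB classes.
interface LockManager { void lock(Object key); void unlock(Object key); }
interface Index { void insert(Object record) throws Exception; } // throws on a duplicate key

class InsertSketch {
    // The point: the primary-key lock must be released even when the insert
    // throws (e.g. on a duplicate key); otherwise later operations on the same
    // key wait forever, which would look exactly like the hang described.
    static void insertWithLock(LockManager locks, Index primaryIndex,
                               Object record, Object pk) throws Exception {
        locks.lock(pk);
        try {
            primaryIndex.insert(record);
        } finally {
            locks.unlock(pk);
        }
    }
}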

I will start fixing this issue tonight.
Cheers,
Abdullah.

Amoudi, Abdullah.

On Mon, Nov 30, 2015 at 4:52 PM, Jianfeng Jia <[email protected]>
wrote:

Dear devs,

I hit a weird issue that is reproducible, but only if the data has
duplicates and is large enough. Let me explain it step by step:

1. The dataset is very simple; it has only two fields.
DDL AQL:
—————————————
drop dataverse test if exists;
create dataverse test;
use dataverse test;

create type t_test as closed {
   fa: int64,
   fb: int64
}

create dataset ds_test(t_test) primary key fa;

create feed fd_test using socket_adapter
(
     ("sockets"="nc1:10001"),
     ("address-type"="nc"),
     ("type-name"="t_test"),
     ("format"="adm"),
     ("duration"="1200")
);

set wait-for-completion-feed "false";
connect feed fd_test to dataset ds_test using policy AdvancedFT_Discard;

——————————————————————————————

The AdvancedFT_Discard policy ignores exceptions from the insertion and
keeps ingesting.
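
(Roughly speaking, the discard semantics amount to the following; this is just
an illustrative sketch with made-up names, not the actual policy
implementation:)

import java.util.List;

// Sketch of the per-record "discard" behavior: a failed insert is dropped and
// ingestion keeps going. Dataset is a made-up interface for this example.
class DiscardPolicySketch {
    interface Dataset { void insert(String admRecord) throws Exception; } // throws on a duplicate key

    static void ingest(Dataset dataset, List<String> incomingRecords) {
        for (String r : incomingRecords) {
            try {
                dataset.insert(r);
            } catch (Exception e) {
                // the failed record is discarded; the feed keeps running
            }
        }
    }
}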

2. The data is ingested by a very simple socket adapter client which reads the
records one by one from an ADM file. The source is here:

https://github.com/JavierJia/twitter-tracker/blob/master/src/main/java/edu/uci/ics/twitter/asterix/feed/FileFeedSocketAdapterClient.java
The data and the app package are provided here:

https://drive.google.com/folderview?id=0B423M7wGZj9dYVQ1TkpBNzcwSlE&usp=sharing
To feed the data you can run:

./bin/feedFile -u 172.17.0.2 -p 10001 -c 5000000 ~/data/twitter/test.adm

-u for the server URL
-p for the server port
-c for the count of lines you want to ingest
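
In case it helps, this is roughly what that client does (a simplified sketch;
the real FileFeedSocketAdapterClient linked above is more complete):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Simplified sketch of a file-to-socket feed client: read ADM records line by
// line from a file and push them to the feed's socket_adapter endpoint.
public class SimpleFeedClient {
    public static void main(String[] args) throws Exception {
        String host = args[0];                    // e.g. 172.17.0.2
        int port = Integer.parseInt(args[1]);     // e.g. 10001
        int maxCount = Integer.parseInt(args[2]); // e.g. 5000000
        String admFile = args[3];                 // one ADM record per line

        try (Socket socket = new Socket(host, port);
             BufferedReader reader = new BufferedReader(new FileReader(admFile))) {
            OutputStream out = socket.getOutputStream();
            String line;
            int count = 0;
            while (count < maxCount && (line = reader.readLine()) != null) {
                out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
                count++;
            }
            out.flush();
        }
    }
}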

3. After ingestion, all requests on ds_test hang. There are no exceptions and
no responses for hours. However, the system can still respond to queries on
other datasets, such as Metadata.

The data contains some duplicate records, which should trigger the insert
exception. If I lower the count from 5000000 to, say, 3000000, there are no
problems, even though that subset contains duplicates as well.

Do any feed experts have a hint on which part could be wrong? The CC and NC
logs are attached. Thank you!

Best,

Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine