I definitely rebuilt both, and I am 100% sure that the AsterixIndexInsertDeleteNodePushable has the bug. Where? Not sure, but most likely it is hidden somewhere in the storage layer.
Tomorrow, I am going to check each of the components in that operator one by one until I can isolate the source of the bug.

Cheers,
Abdullah.

Amoudi, Abdullah.

On Wed, Nov 11, 2015 at 11:27 PM, Jianfeng Jia wrote:

> Then that will be two different issues.
> Just want to make sure that you’ve rebuilt Hyracks (not only AsterixDB)
> before testing your code, because those changes are in Hyracks.
> And could you send the logical plan and the Hyracks job so that we can pin
> down which Hyracks operators are involved?
>
>
> > On Nov 11, 2015, at 12:10 PM, abdullah alamoudi wrote:
> >
> > That was my first thought, as I said, but I am 100% sure the issue is not
> > in the SerDe. To confirm this, I removed the reader and writer from the
> > SerDe and created a new instance of reader/writer in every call to
> > serialize or deserialize, just to determine if the problem is gone.
> >
> > The problem didn't go away and I still had the same issue. That is why I
> > know for sure it is not the SerDe.
> >
> > Don't waste any more time in that direction.
> > ~Abdullah.
> >
> > Amoudi, Abdullah.
> >
> > On Wed, Nov 11, 2015 at 10:54 PM, Jianfeng Jia wrote:
> >
> >> Here are my findings and thoughts.
> >> I think I’ve checked all the direct use cases of UTF8SerDer. However, I
> >> missed some indirect static/shared use cases of UTF8SerDer.
> >>
> >> One big suspect is the RecordDescriptor, which has the
> >> ISerializerDeserializers inside and is always passed into the Factory
> >> method and shared by the ThreadMethod (usually NodePushable).
> >> E.g., in the ResultWriterOperatorDescriptor, the outRecordDesc is passed
> >> to the createPushRuntime() factory method to create the
> >> “resultSerializer”, and it is shared by the thread object
> >> AbstractUnaryInputSinkOperatorNodePushable. This pushable object will
> >> directly get the deserializer from the shared
> >> recordDescriptor.getFields()[i]. It explains the issue-1164.
> >>
> >> I guess in your case there must be some deserializers given by a shared
> >> RecordDescriptor. Then it will get into the race condition if there are
> >> some UTF8StringSerDer involved.
> >>
> >> Given that the SerDers are stored in the shared RecordDescriptor, I think
> >> the very initial design was to make all the SerDers thread-safe. And
> >> maybe some other data structures store the SerDers and are passed/used
> >> in the same way. Then I’d have to propose rolling back the UTF8SerDer to
> >> the stateless version (at the expense of creating an intermediate buffer
> >> array per record).
> >>
> >> Any opinions?
> >>
> >>
> >>> On Nov 11, 2015, at 10:54 AM, abdullah alamoudi wrote:
> >>>
> >>> That was my first thought and so I changed it. The issue is still there.
> >>> I am also using the UTF8StringSerializerDeserializer to deserialize the
> >>> strings and they always serialize it correctly.
> >>>
> >>> I am thinking maybe it is related to the UTF8StringPointable but I am
> >>> not sure how that could be.
> >>> I am looking at this as well,
> >>> Abdullah.
> >>>
> >>> Amoudi, Abdullah.
> >>>
> >>> On Wed, Nov 11, 2015 at 8:05 PM, Jianfeng Jia wrote:
> >>>
> >>>> The possible race condition could be that the
> >>>> UTF8StringSerializerDeserializer is not a singleton any more. It
> >>>> was implemented to reuse the byte[] used to serialize/deserialize the
> >>>> string object. Let me look into this issue.
> >>>>
> >>>>> On Nov 11, 2015, at 8:37 AM, abdullah alamoudi wrote:
> >>>>>
> >>>>> Highly probable.
> >>>>> Please, let's fix this soon.
> >>>>>
> >>>>> Amoudi, Abdullah.
> >>>>>
> >>>>> On Wed, Nov 11, 2015 at 7:32 PM, Till Westmann wrote:
> >>>>>
> >>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1164
> >>>>>> might be related.
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Till
> >>>>>>
> >>>>>> On 11 Nov 2015, at 8:25, abdullah alamoudi wrote:
> >>>>>>
> >>>>>>> Hi all,
> >>>>>>> I am having a hard time figuring this out. Here are the symptoms I
> >>>>>>> am seeing, in case anyone has an idea what this could be.
> >>>>>>>
> >>>>>>> I have a feed running, ingesting data into a dataset. Sporadically,
> >>>>>>> I get duplicate key exception errors (the key is of a string type)
> >>>>>>> and I am 100% sure that I don't have duplicate records.
> >>>>>>>
> >>>>>>> Moreover, I am printing the content of the frames about to be
> >>>>>>> inserted into the primary index and there are no duplicate records.
> >>>>>>>
> >>>>>>> There are three reasons why I am suspecting the String
> >>>>>>> implementation:
> >>>>>>> 1. It is a fairly recent change.
> >>>>>>> 2. When I run on a single node, or run one thread at a time, I never
> >>>>>>> get this exception.
> >>>>>>> 3. The key is a String.
> >>>>>>>
> >>>>>>> I have looked at the change trying to figure out where a race
> >>>>>>> condition might take place, but it is well hidden (if it is there
> >>>>>>> at all).
> >>>>>>>
> >>>>>>> Let me know if you have seen something similar.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Abdullah.
> >>>>>>
> >>>>
> >>>> Best,
> >>>>
> >>>> Jianfeng Jia
> >>>> PhD Candidate of Computer Science
> >>>> University of California, Irvine
> >>>>
> >>
> >> Best,
> >>
> >> Jianfeng Jia
> >> PhD Candidate of Computer Science
> >> University of California, Irvine
> >>
> >
> Best,
>
> Jianfeng Jia
> PhD Candidate of Computer Science
> University of California, Irvine
>
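
For reference, the kind of race discussed in this thread (a SerDe that reuses an internal byte[] and is shared across threads through something like a RecordDescriptor) can be sketched in plain Java. This is only an illustration under stated assumptions, not Hyracks code: SharedSerDeSketch, StringCodec, BufferReusingCodec, StatelessCodec and countCorrupted are all hypothetical names, and the Thread.yield() is there only to widen the race window. Whether this mechanism is actually behind the duplicate key exceptions is still open, since the latest message above points at the storage layer instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

public class SharedSerDeSketch {

    /** Hypothetical stand-in for a string serializer/deserializer. */
    public interface StringCodec {
        String roundTrip(String value);
    }

    /** Stateful: reuses one scratch buffer, so sharing an instance across threads is unsafe. */
    public static final class BufferReusingCodec implements StringCodec {
        private byte[] scratch = new byte[64];

        @Override
        public String roundTrip(String value) {
            byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
            if (scratch.length < encoded.length) {
                scratch = new byte[encoded.length];
            }
            System.arraycopy(encoded, 0, scratch, 0, encoded.length);
            Thread.yield(); // another thread may overwrite scratch here
            return new String(scratch, 0, encoded.length, StandardCharsets.UTF_8);
        }
    }

    /** Stateless: allocates a fresh array per call, so one instance can be shared. */
    public static final class StatelessCodec implements StringCodec {
        @Override
        public String roundTrip(String value) {
            byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
            return new String(encoded, StandardCharsets.UTF_8);
        }
    }

    /** Runs round trips on several threads and counts keys that came back changed. */
    public static int countCorrupted(Supplier<StringCodec> codecForThread, int threads, int keysPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int id = t;
                final StringCodec codec = codecForThread.get();
                results.add(pool.submit(() -> {
                    int bad = 0;
                    for (int i = 0; i < keysPerThread; i++) {
                        String key = "key-" + id + "-" + i;
                        if (!key.equals(codec.roundTrip(key))) {
                            bad++;
                        }
                    }
                    return bad;
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        StringCodec shared = new BufferReusingCodec();
        StringCodec stateless = new StatelessCodec();
        // The first count is timing dependent and may be nonzero; the other two should be 0.
        System.out.println("shared stateful:     " + countCorrupted(() -> shared, 4, 50_000));
        System.out.println("per-thread stateful: " + countCorrupted(BufferReusingCodec::new, 4, 50_000));
        System.out.println("shared stateless:    " + countCorrupted(() -> stateless, 4, 50_000));
    }
}
```

A key that comes back changed in the shared stateful case is the sort of corruption that could, in principle, collide with an existing key. The two safe variants correspond to the options raised in the thread: one SerDe instance per thread, or rolling back to a stateless SerDe at the cost of an intermediate buffer per record.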
