Looks like an interesting case. Just a small question. Are you sure a spatial index is the right one to use here? The spatial attribute looks more like a categorization and a hash or B-tree index could be more suitable. As far as I know, the spatial index in AsterixDB is a secondary R-tree index which, like any other secondary index, is only good for retrieving a small number of records. For this dataset, it seems that any small range would still return a huge number of records.
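A quick back-of-the-envelope check of that selectivity argument, using the record and point counts reported later in this thread (the script is just an illustrative sketch, not anything from the list):

```python
# Rough selectivity of a secondary index over the point attribute,
# using the counts reported later in this thread.
records = 2_255_091_590   # total records
distinct_points = 10_391  # distinct point values

# Average number of records sharing a single point value:
avg_per_point = records / distinct_points
print(f"~{avg_per_point:,.0f} records per distinct point")  # ~217,024

# Even a spatial range covering just 1% of the distinct points would
# still select tens of millions of records - far too many for a
# secondary-index lookup to pay off over a scan.
selected = int(distinct_points * 0.01) * avg_per_point
print(f"a 1% range selects ~{selected:,.0f} records")
```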
It is still interesting to further investigate and fix the sort issue, but I mentioned the usage issue to offer a different perspective.

Thanks,
Ahmed

On Wed, Sep 14, 2016 at 10:30 AM, Mike Carey <dtab...@gmail.com> wrote:

☺!

On Sep 14, 2016 1:11 AM, "Wail Alkowaileet" <wael....@gmail.com> wrote:

To be exact, I have 2,255,091,590 records and 10,391 points :-)

On Wed, Sep 14, 2016 at 10:46 AM, Mike Carey <dtab...@gmail.com> wrote:

Thx! I knew I'd meant to "activate" the thought somehow, but couldn't remember having done it for sure. Oops! Scattered from VLDB, I guess...!

On 9/13/16 9:58 PM, Taewoo Kim wrote:

@Mike: You filed an issue - https://issues.apache.org/jira/browse/ASTERIXDB-1639. :-)

Best,
Taewoo

On Tue, Sep 13, 2016 at 9:28 PM, Mike Carey <dtab...@gmail.com> wrote:

I can't remember (slight jetlag? :-)) if I shared back to this list one theory that came up in India when Wail and I talked F2F - his data has a lot of duplicate points, so maybe something goes awry in that case. I wonder if we've sufficiently tested that case? (E.g., what if there are gazillions of records originating from a small handful of points?)

On 8/26/16 9:55 AM, Taewoo Kim wrote:

Based on a rough calculation, per partition, each point field takes 3.6GB (16 bytes * 2,887,453,794 records / 12 partitions). To sort 3.6GB, we are generating 625 files (96MB or 128MB each) = 157GB. Since Wail mentioned that there was no issue when creating a B+ tree index, we need to check what SORT process is required by the R-Tree index.
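Taewoo's rough calculation can be double-checked with a few lines (the figures are taken from this thread; the script is only a sanity check, not anything AsterixDB runs):

```python
# Sanity check of the rough sort-sizing numbers discussed in this thread.
records = 2_887_453_794   # total records in the dataset
partitions = 12           # 3 NCs x 4 iodevices
point_bytes = 16          # a point is two 8-byte doubles

per_partition = records * point_bytes / partitions
print(f"point data per partition: {per_partition / 2**30:.2f} GiB")  # ~3.59 GiB

# 625 run files of 96MB or 128MB each were observed before the crash;
# that alone is an order of magnitude more than the raw point data:
low, high = 625 * 96 * 2**20, 625 * 128 * 2**20
print(f"625 runs total {low / 2**30:.0f}-{high / 2**30:.0f} GiB")  # ~59-78 GiB
```

The gap between ~3.6 GiB of key data per partition and the tens of gigabytes of observed run files is exactly what makes the run contents, and their cleanup, worth checking.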
Best,
Taewoo

On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia <jianfeng....@gmail.com> wrote:

If all of the file names start with "ExternalSortRunGenerator", then they are first-round files, which cannot be GCed. Could you provide the query plan as well?

On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <wael....@gmail.com> wrote:

Hi Ian and Pouria,

The names of the files, along with their sizes (there were 625 of those before crashing):

size   name
96MB   ExternalSortRunGenerator8917133039835449370.waf
128MB  ExternalSortRunGenerator8948724728025392343.waf

No files were generated beyond runs.
compiler.sortmemory = 64MB

Here are the full logs:
https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_25_07%3A34%3A52_AST_2016.zip?dl=0

On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh <pouria.pirza...@gmail.com> wrote:

We previously had issues with huge spilled sort temp files when creating the inverted index for fuzzy queries, but NOT R-Trees. I also recall that Yingyi fixed the issue of delaying clean-up of intermediate temp files until the end of the query execution. If you can share the names of a couple of temp files (and their sizes, along with the sort memory setting you have in asterix-configuration.xml), we may be able to make a better guess as to whether the sort is really going into a two-level merge or not.
Pouria

On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <ima...@uci.edu> wrote:

I think that exception ("No space left on device") is just cast from the native IOException. Therefore I would be inclined to believe it's genuinely out of space. I suppose the question is why the external sort is so huge. What is the query plan? Maybe that will shed light on a possible cause.

On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <wael....@gmail.com> wrote:

I was monitoring inodes ... it didn't go beyond 1%.

On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <wael....@gmail.com> wrote:

Hi Chris and Mike,

Actually, I was monitoring it to see what's going on:

- The size of each partition is about 40GB (80GB in total per iodevice).
- The runs took 157GB per iodevice (about 2x the dataset size). Each run takes either 128MB or 96MB of storage.
- At a certain time, there were 522 runs.

I even tried to create a BTree index to see if that happens as well. I created two BTree indexes, one on *location* and one on *caller*, and they were created successfully. The sizes of those runs didn't come anywhere near that.

Logs are attached.
On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <dtab...@gmail.com> wrote:

I think we might have "file GC issues" - I vaguely remember that we don't (or at least didn't once upon a time) proactively remove unnecessary run files - removing all of them at end-of-job instead of at the end of the execution phase that uses their contents. We may also have an "Amdahl problem" right now with our sort, since we serialize phase two of parallel sorts - though this is not a query, it's an index build, so that shouldn't be it. It would be interesting to put a df/sleep script on each of the nodes when this is happening - actually, a script that monitors the temp file directory - and watch the lifecycle happen and the sizes change....

On 8/23/16 2:06 AM, Chris Hillery wrote:

When you get the "disk full" warning, do a quick "df -i" on the device - possibly you've run out of inodes even if the space isn't all used up. It's unlikely, because I don't think AsterixDB creates a bunch of small files, but worth checking.
If that's not it, then can you share the full exception and stack trace?

Ceej
aka Chris Hillery

On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <wael....@gmail.com> wrote:

I just cleared the hard drives to get 80% free space. I still get the same issue.

The data contains:
1. 2,887,453,794 records.
2. Schema:

create type CDRType as {
  id: uuid,
  'date': string,
  'time': string,
  'duration': int64,
  'caller': int64,
  'callee': int64,
  location: point?
}

On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet <wael....@gmail.com> wrote:

Dears,

I have a dataset of size 290GB loaded into 3 NCs, each of which has 2x500GB SSDs. Each NC has two iodevices (partitions) on each hard drive (i.e., the total is 4 iodevices per NC).
After loading the data, each Asterix partition occupied 31GB. The cluster has about 50% free space on each hard drive (approximately 250GB free space per drive). However, when I tried to create an index of type RTree, I got an exception that there was no space left on the device during the External Sort phase.

Is that normal?

--
*Regards,*
Wail Alkowaileet

Best,

Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine