I think that exception ("No space left on device") is just cast from the
native IOException, so I would be inclined to believe the device is
genuinely out of space. I suppose the question is why the external sort is
so huge.
What is the query plan? Maybe that will shed light on a possible cause.

On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <[email protected]> wrote:

> I was monitoring Inodes ... it didn't go beyond 1%.
>
> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <[email protected]>
> wrote:
>
> > Hi Chris and Mike,
> >
> > Actually I was monitoring it to see what's going on:
> >
> >    - The size of each partition is about 40GB (80GB in total per
> >      iodevice).
> >    - The runs took 157GB per iodevice (about 2x of the dataset size).
> >      Each run takes either of 128MB or 96MB of storage.
> >    - At a certain time, there were 522 runs.
> >
> > I even tried to create a BTree Index to see if that happens as well. I
> > created two BTree indexes one for the *location* and one for the
> > *caller* and they were created successfully. The sizes of the runs
> > didn't take anyway near that.
> >
> > Logs are attached.
> >
> > On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <[email protected]> wrote:
> >
> >> I think we might have "file GC issues" - I vaguely remember that we
> >> don't (or at least didn't once upon a time) proactively remove
> >> unnecessary run files - removing all of them at end-of-job instead of
> >> at the end of the execution phase that uses their contents. We may
> >> also have an "Amdahl problem" right now with our sort since we
> >> serialize phase two of parallel sorts - though this is not a query,
> >> it's index build, so that shouldn't be it. It would be interesting to
> >> put a df/sleep script on each of the nodes when this is happening -
> >> actually a script that monitors the temp file directory - and watch
> >> the lifecycle happen and the sizes change....
> >>
> >> On 8/23/16 2:06 AM, Chris Hillery wrote:
> >>
> >>> When you get the "disk full" warning, do a quick "df -i" on the
> >>> device - possibly you've run out of inodes even if the space isn't
> >>> all used up. It's unlikely because I don't think AsterixDB creates a
> >>> bunch of small files, but worth checking.
> >>>
> >>> If that's not it, then can you share the full exception and stack
> >>> trace?
> >>>
> >>> Ceej
> >>> aka Chris Hillery
> >>>
> >>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <[email protected]>
> >>> wrote:
> >>>
> >>>> I just cleared the hard drives to get 80% free space. I still get
> >>>> the same issue.
> >>>>
> >>>> The data contains:
> >>>> 1- 2887453794 records.
> >>>> 2- Schema:
> >>>>
> >>>> create type CDRType as {
> >>>>     id:uuid,
> >>>>     'date':string,
> >>>>     'time':string,
> >>>>     'duration':int64,
> >>>>     'caller':int64,
> >>>>     'callee':int64,
> >>>>     location:point?
> >>>> }
> >>>>
> >>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> Dears,
> >>>>>
> >>>>> I have a dataset of size 290GB loaded in a 3 NCs each of which has
> >>>>> 2x500GB SSD.
> >>>>>
> >>>>> Each of NC has two IODevices (partitions) in each hard drive (i.e
> >>>>> the total is 4 iodevices per NC). After loading the data, each
> >>>>> Asterix partition occupied 31GB.
> >>>>>
> >>>>> The cluster has about 50% free space in each hard drive
> >>>>> (approximately about 250GB free space in each hard drive). However,
> >>>>> when I tried to create an index of type RTree, I got an exception
> >>>>> that no space left in the hard drive during the External Sort
> >>>>> phase.
> >>>>>
> >>>>> Is that normal ?
> >>>>>
> >>>>> --
> >>>>> *Regards,*
> >>>>> Wail Alkowaileet
> >>>>
> >>>> --
> >>>> *Regards,*
> >>>> Wail Alkowaileet
> >
> > --
> > *Regards,*
> > Wail Alkowaileet
>
> --
> *Regards,*
> Wail Alkowaileet
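
For the df/sleep idea Mike mentions in the thread, something along these lines might do as a starting point. It is only a rough sketch, not anything that ships with AsterixDB: the function names `snapshot` and `monitor` and the directory path in the usage comment are made up for illustration, and you would need to point it at wherever the sort run files actually land on each iodevice.

```python
import os
import shutil
import time


def snapshot(temp_dir):
    """Return (file_count, total_bytes, free_bytes) for temp_dir."""
    file_count = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(temp_dir):
        for name in files:
            try:
                total_bytes += os.path.getsize(os.path.join(root, name))
                file_count += 1
            except OSError:
                # A run file may be deleted while we scan; skip it.
                continue
    # Free space on the filesystem that holds temp_dir (what df reports).
    free_bytes = shutil.disk_usage(temp_dir).free
    return file_count, total_bytes, free_bytes


def monitor(temp_dir, interval_seconds=10, samples=None):
    """Print one line per sample; samples=None means run until interrupted."""
    taken = 0
    while samples is None or taken < samples:
        count, used, free = snapshot(temp_dir)
        print(f"{time.strftime('%H:%M:%S')} files={count} "
              f"used={used / 2**30:.2f}GiB free={free / 2**30:.2f}GiB",
              flush=True)
        taken += 1
        if samples is None or taken < samples:
            time.sleep(interval_seconds)


# Hypothetical usage (the path is an assumption, adjust per node):
# monitor("/path/to/iodevice/temp", interval_seconds=10)
```

Running it on each NC while the index build is in progress should show whether the number and total size of files keep growing until the end of the job, or drop once a phase finishes.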
