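One thing to know about is sizehint(). The Dict has to rehash when the number of items exceeds the number of available slots, so if you know roughly how many entries you will end up with, calling sizehint() before filling it avoids the repeated rehashing as it grows.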
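
A minimal sketch of the sizehint() idea, in case it helps. The key type `Item` and the count `n` are made up for illustration (standing in for the "object" keys mentioned in the thread), and the call is written as `sizehint!`, which is the spelling in current Julia; in the Julia of this thread it was `sizehint(d, n)`.

```julia
# Hypothetical key type; stands in for the custom "object" keys from the thread.
struct Item
    id::Int
end

n = 100_000                 # assumed number of entries, chosen arbitrarily

# A concretely typed Dict, with room reserved up front so it does not
# need to rehash repeatedly while it is being filled.
d = Dict{Item,Int}()
sizehint!(d, n)             # `sizehint(d, n)` in 2014-era Julia

for i in 1:n
    d[Item(i)] = i          # insert without triggering incremental growth
end
```

Whether this accounts for the slowdown described below is a separate question; it only removes the rehashing cost during growth.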
--Tim

On Tuesday, April 01, 2014 10:43:52 AM Stefan Karpinski wrote:
> If you can provide some example code, lots of people here are more than
> happy to help performance optimize it, but without example code, it's hard
> to give you anything more specific than don't allocate more than you have
> to and don't make copies of things if you don't have to. Using mutating
> APIs (functions with ! at the end) is helpful. The Dict itself isn't
> allocating small objects on the heap, but you may very well be generating a
> lot of garbage along the way.
>
> On Tue, Apr 1, 2014 at 10:34 AM, Freddy Chua <[email protected]> wrote:
> > Hard to isolate my code. It may not be the Dictionary as I have also
> > noticed abnormal behaviour with my own customised linked list. I also read
> > in a pretty big chunk of data (1GB). But let me describe it briefly here.
> >
> > My program read in some data in the order of 10s of MB
> >
> > Then it reads in the 1GB data and start processing it, this is the second
> > stage of the program. I ran several tests on this part, sometimes, I read
> > in 10 mb only, sometimes i read in 20mb, sometimes 100 mb. Note, the data
> > in here is independent of one another, it only takes up additional memory
> > in the O/S. So reading in twice the amount of data should only require
> > double the amount of time.
> >
> > What I discover is, the time spent does not scale linearly, infact, it
> > increases polynomially. I suspect this is due to the way Julia handles
> > data structures and objects with incontiguous memory allocation.
> >
> > Could someone give some tips on memory management?
> >
> > On Tuesday, April 1, 2014 10:26:43 PM UTC+8, Iain Dunning wrote:
> >> Can you give _any_ sample code to demonstrate this behaviour?
> >>
> >> On Tuesday, April 1, 2014 9:00:20 AM UTC-4, Freddy Chua wrote:
> >>> ObjectIdDict does not allow pre-defined types...... wouldn't that affect
> >>> the performance too?
> >>>
> >>> On Tuesday, April 1, 2014 8:55:45 PM UTC+8, Isaiah wrote:
> >>>> You could try ObjectIdDict, which is specialized for this use case.
> >>>>
> >>>> On Tue, Apr 1, 2014 at 6:51 AM, Freddy Chua <[email protected]> wrote:
> >>>>> Just to add, the key is an object rather than the usual ASCIIString or
> >>>>> Int64
> >>>>>
> >>>>> On Tuesday, April 1, 2014 6:34:08 PM UTC+8, Freddy Chua wrote:
> >>>>>> I am using Dict to store my values. Since it is hash table, I thought
> >>>>>> that the performance would remain fairly constant even as the
> >>>>>> dictionary grows bigger. But this is not what I am experience at the
> >>>>>> moment. When the size of my Dict grows, the cost of retrieval
> >>>>>> increases as well. Can someone help me here??? I really need the
> >>>>>> dictionary to be fast...
