sizeof() is a bit tricky. Currently IntSet is considered not to have a canonical binary representation, so sizeof() defaults to giving the size of the struct. This is a bit sketchy, but it tends to be convenient.
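To make the difference concrete, here is a minimal sketch. ToyIntSet and toy_push! are hypothetical stand-ins, not the real Base type: the field names and the UInt32 element type are assumptions, chosen only to mirror the s.bits field mentioned in the quoted message. It shows sizeof(s) staying constant while sizeof(s.bits) grows with the largest element.

```julia
# Hypothetical stand-in for a bit-string-backed integer set.
# Field names and element type are assumptions for illustration,
# not the actual Base implementation.
mutable struct ToyIntSet
    bits::Vector{UInt32}   # backing array, one bit per possible member
    limit::Int             # number of bits currently allocated
end

ToyIntSet() = ToyIntSet(zeros(UInt32, 8), 256)

# Add a non-negative integer n, growing the backing array when needed.
function toy_push!(s::ToyIntSet, n::Integer)
    n >= 0 || throw(ArgumentError("n must be non-negative"))
    if n >= s.limit
        nwords = (n >> 5) + 1              # 32 bits per UInt32 word
        oldlen = length(s.bits)
        resize!(s.bits, nwords)
        s.bits[oldlen+1:end] .= 0          # resize! does not zero the new words
        s.limit = 32 * nwords
    end
    s.bits[(n >> 5) + 1] |= UInt32(1) << (n & 31)
    return s
end

s = ToyIntSet()
struct_size_before = sizeof(s)        # size of the struct's own fields
array_size_before  = sizeof(s.bits)   # bytes held by the backing array

toy_push!(s, 1_000_000)

struct_size_after = sizeof(s)         # unchanged: the fields are the same size
array_size_after  = sizeof(s.bits)    # larger: the array had to grow

println("sizeof(s):      ", struct_size_before, " -> ", struct_size_after)
println("sizeof(s.bits): ", array_size_before,  " -> ", array_size_after)
```

In this sketch the struct size never changes because it only holds a reference to the array plus a counter; all of the growth happens in the array that s.bits points to.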
On Fri, Feb 28, 2014 at 11:49 AM, Ivar Nesje <[email protected]> wrote:
> My point was to illustrate that the limit will be arbitrarily chosen. It is
> impossible to set a limit on how big the numbers stored in an IntSet should
> be allowed to be. It depends on the application, and can only be
> determined by the programmer. Sorry for not stating that clearly. You seem to
> understand the issue better than my reasoning :P
>
> You should probably use a different storage mechanism that explicitly
> handles reading and writing the result unused parts of the array to disk if
> your IntSet has trouble fitting in memory. Maybe you will get good results
> if you just change the backing array in IntSet to a mmapped file?
> Unfortunately you have to copy all of the method definitions for IntSet,
> because we do not have inheritance from concrete types.
>
> PS: To get the size of the backing array, you have to use sizeof(s.bits);
> sizeof(s) is the constant size of the IntSet struct.
>
> Regards Ivar
>
> On Friday, 28 February 2014 at 16:22:01 UTC+1, David P. Sanders wrote:
>>
>> On Friday, 28 February 2014 at 08:41:37 UTC-6, Ivar Nesje wrote:
>>>
>>> The documentation states very clearly that IntSet should only be used for
>>> dense collections, and that Set should be used for sparse collections.
>>
>> Agreed. Of course, this was just a toy example to test the limits of
>> IntSet.
>> In the real application that I am working towards, I want to think about
>> systems of size at least 10^5 x 10^5. Mapping pairs (x,y) in this system to
>> a single number gives up to 10^10,
>> which is what I was testing.
>>
>>>
>>>> Construct a sorted set of the integers generated by the given iterable
>>>> object, or an empty set. Implemented as a bit string, and therefore
>>>> designed for dense integer sets. If the set will be sparse (for example
>>>> holding a single very large integer), use Set instead.
>>>
>>> Do you happen to know a nice limit to how much memory IntSet should be
>>> allowed to use?
>>
>> In this case, it seems to be using more memory than I have available on my
>> machine (4GB on my laptop).
>> I guess my point is that normally I would expect that to give me an
>> out-of-memory error, rather than enter an infinite loop producing garbage.
>>
>>>
>>> On my laptop 100 MB would be more than I can afford
>>
>> Not sure what you mean by that -- doesn't it rather depend on the
>> application? If I am doing a heavy computation on my laptop overnight, I am
>> happy for it to use all available memory.
>>
>>>
>>> , but that would make IntSet unusable for bigger calculations on bigger
>>> systems, so it should be no smaller than 10 GB.
>>
>> I have another machine with a lot of memory (128 GB), so I certainly do
>> not want to impose an arbitrary restriction.
>>
>> David.
>>
>>>
>>> Ivar
>>>
>>> On Friday, 28 February 2014 at 15:16:33 UTC+1, David P. Sanders wrote:
>>>>
>>>> I am investigating possible data structures for an application.
>>>> Here is an "interesting" behaviour in IntSet, which is no doubt to do
>>>> with the implementation.
>>>> Maybe it should just throw an exception if someone tries to add a really
>>>> large integer like this!
>>>>
>>>> julia> s = IntSet()
>>>> IntSet()
>>>>
>>>> julia> push!(s, 100000)
>>>> IntSet(100000)
>>>>
>>>> julia> sizeof(s)
>>>> 24
>>>>
>>>> julia> push!(s, 1000000)
>>>> IntSet(100000, 1000000)
>>>>
>>>> julia> sizeof(s)
>>>> 24
>>>>
>>>> julia> push!(s, 10000000)
>>>> IntSet(100000, 1000000, 10000000)
>>>>
>>>> julia> push!(s, 100000000)
>>>> IntSet(100000, 1000000, 10000000, 100000000)
>>>>
>>>> julia> push!(s, 1000000000)
>>>> IntSet(100000, 1000000, 10000000, 100000000, 1000000000)
>>>>
>>>> julia> sizeof(s)
>>>> 24
>>>>
>>>> julia> push!(s, 10000000000)
>>>> IntSet(100000, 1000000, 10000000, 100000000, 1000000000, 1410065408,
>>>> 1410065408, 1410065408, 1410065408, 1410065408, 1410065408, 1410065408,
>>>> 1410065408, 1410065408, 1410065408, 1410065408, 1410065408, 1410065408,
>>>> 1410065408, 1410065408, 1410065408, 1410065408^CEvaluation succeeded, but
>>>> an error occurred while showing value of type IntSet:
>>>> ERROR: interrupt
>>>> in show at intset.jl:172
>>>> in anonymous at show.jl:973
>>>> in showlimited at show.jl:972
>>>> in writemime at repl.jl:2
>>>> in display at multimedia.jl:117
>>>> in display at multimedia.jl:119
>>>> in display at multimedia.jl:151
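
A note on the repeated value in the session above: 1410065408 is exactly 10^10 mod 2^32, i.e. what is left of 10000000000 after keeping only its low 32 bits. That part is plain arithmetic. Whether such a truncation actually happens inside intset.jl is an assumption; nothing in the thread confirms it. The check below is a sketch in current Julia syntax, with illustrative variable names.

```julia
n = 10_000_000_000                 # the value pushed in the session; needs 34 bits

# Keep only the low 32 bits, as an unchecked conversion to a 32-bit integer would.
wrapped = Int64(n % UInt32)

# Same result via explicit modular arithmetic.
wrapped_mod = mod(Int64(n), Int64(2)^32)

println(wrapped)       # 1410065408
println(wrapped_mod)   # 1410065408
```

If that is indeed the cause, a bounds check before the conversion (throwing an error for values that do not fit) would match the exception suggested in the original post.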
