sizeof() is a bit tricky. Currently IntSet is considered not to have a
canonical binary representation, so sizeof() defaults to giving the
size of the struct. This is a bit sketchy, but it tends to be
convenient.

On Fri, Feb 28, 2014 at 11:49 AM, Ivar Nesje <[email protected]> wrote:
> My point was to illustrate that the limit will be arbitrary chosen. It is
> impossible to set a limit for how big numbers that should be allowed to be
> stored in a IntSet. It depends on the application, and can only be
> determined by the programmer. Sorry for not stating that clear. You seem to
> understand the issue better than my reasoning :P
>
> You should probably use a different storage mechanism that explicitly
> handles reading and writing the result unused parts of the array to disk if
> your IntSet have trouble fitting in memory. Maybe you will get good results
> if you just change the backing array in IntSet to a mmapped file?
> Unfortunately you have to copy all of the method definitions for IntSet,
> because we do not have inheritance from concrete types.
>
> PS: To get the size of the backing array, you have to use sizeof(s.bits),
> sizeof(s) is the constant size of the IntSet struct.
>
> Regards Ivar
>
> kl. 16:22:01 UTC+1 fredag 28. februar 2014 skrev David P. Sanders følgende:
>>
>>
>>
>> El viernes, 28 de febrero de 2014 08:41:37 UTC-6, Ivar Nesje escribió:
>>>
>>> The documentation states very clear that IntSet should only be used for
>>> dense collections, and that Set, should be used for sparse collections.
>>
>>
>> Agreed. Of course, this was just a toy example to test the limits of
>> IntSet.
>> In the real application that I am working towards, I want to think about
>> systems of size at least 10^5 x 10^5. Mapping pairs (x,y) in this system to
>> a single number gives up to 10^10,
>> which is what I was testing.
>>
>>
>>>
>>>
>>>> Construct a sorted set of the integers generated by the given iterable
>>>> object, or an empty set. Implemented as a bit string, and therefore 
>>>> designed
>>>> for dense integer sets. If the set will be sparse (for example holding a
>>>> single very large integer), use Set instead.
>>>
>>>
>>> Do you happen to know a nice limit to how much memory IntSet should be
>>> allowed to use?
>>
>>
>> In this case, it seems to be using more memory than I have available on my
>> machine (4GB on my laptop).
>> I guess my point is that normally I would expect that to give me an
>> out-of-memory error, rather than enter an infinite loop producing garbage.
>>
>>
>>>
>>> On my laptop 100 MB would be more than I can afford
>>
>>
>> Not sure what you mean by that -- doesn't it rather depend on the
>> application? If I am doing a heavy computation on my laptop over night, I am
>> happy for it to use all available memory.
>>
>>
>>>
>>> , but that would make IntSet unusable for bigger calculations on bigger
>>> systems, so it should be no smaller than 10 GB.
>>
>>
>> I have another machine with a lot of memory (128 GB), so I certainly do
>> not want to impose an arbitrary restriction.
>>
>>
>> David.
>>
>>>
>>>
>>> Ivar
>>>
>>> kl. 15:16:33 UTC+1 fredag 28. februar 2014 skrev David P. Sanders
>>> følgende:
>>>>
>>>>
>>>> I am investigating possible data structures for an application.
>>>> Here is an "interesting" behaviour in IntSet, which is no doubt to do
>>>> with the implementation.
>>>> Maybe it should just throw an exception if someone tries to add a really
>>>> large integer like this!
>>>>
>>>>
>>>> julia> s = IntSet()
>>>> IntSet()
>>>>
>>>> julia> push!(s, 100000)
>>>> IntSet(100000)
>>>>
>>>> julia> sizeof(s)
>>>> 24
>>>>
>>>> julia> push!(s, 1000000)
>>>> IntSet(100000, 1000000)
>>>>
>>>> julia> sizeof(s)
>>>> 24
>>>>
>>>> julia> push!(s, 10000000)
>>>> IntSet(100000, 1000000, 10000000)
>>>>
>>>> julia> push!(s, 100000000)
>>>> IntSet(100000, 1000000, 10000000, 100000000)
>>>>
>>>> julia> push!(s, 1000000000)
>>>> IntSet(100000, 1000000, 10000000, 100000000, 1000000000)
>>>>
>>>> julia> sizeof(s)
>>>> 24
>>>>
>>>> julia> push!(s, 10000000000)
>>>> IntSet(100000, 1000000, 10000000, 100000000, 1000000000, 1410065408,
>>>> 1410065408, 1410065408, 1410065408, 1410065408, 1410065408, 1410065408,
>>>> 1410065408, 1410065408, 1410065408, 1410065408, 1410065408, 1410065408,
>>>> 1410065408, 1410065408, 1410065408, 1410065408^CEvaluation succeeded, but 
>>>> an
>>>> error occurred while showing value of type IntSet:
>>>> ERROR: interrupt
>>>>  in show at intset.jl:172
>>>>  in anonymous at show.jl:973
>>>>  in showlimited at show.jl:972
>>>>  in writemime at repl.jl:2
>>>>  in display at multimedia.jl:117
>>>>  in display at multimedia.jl:119
>>>>  in display at multimedia.jl:151

Reply via email to