Re: [OT] gcc: maximum length of an array?

2006-07-25 Thread Giorgos Keramidas
On 2006-07-27 17:02, "P.U.Kruppa" <[EMAIL PROTECTED]> wrote:
>On Sun, 23 Jul 2006, Giorgos Keramidas wrote:
>>On 2006-07-24 20:49, "P.U.Kruppa" <[EMAIL PROTECTED]> wrote:
>>> Hi,
>>> sorry for posting an [OT], but usually people on this list know
>>> everything :-)
>>>
>>> Since I don't know too much about programming I am frequently
>>> fascinated by simple things like Eratosthenes' sieve.  As you might
>>> remember, one has to create a boolean array for that. The longer the
>>> array the more primes can be found.
>>>
>>> With malloc() I can create an array of length 1 (10^8) and
>>> the first 5761455 primes are calculated in a few seconds.  So of
>>> course I would like to test length 10^9 but here my program crashes.
>>
>> If this is about integer values, which are probably 32-bit, you are
>> hitting the kern.maxdsiz limit of 512 MB.  An array of 100,000,000
>> 32-bit values takes up 4 * 100,000,000 = 400,000,000 (close to 400 MiB
>> of memory to store).  Anything above 512 MB in size will make the data
>> size of your program so big that it will overflow the data seg size:
>>
>>   $ ulimit -a
>>   core file size  (blocks, -c) unlimited
>>   data seg size   (kbytes, -d) 524288
>>   ...
>>
>> You can either increase kern.maxdsiz in your `/boot/loader.conf' file,
>> or redesign the algorithm to work with larger datasets by splitting them
>> in chunks that you can still process with 512 MB of data :)
>
> *How* can I effectively split my array up?

Not by using the original Sieve of Eratosthenes, that's for sure.

By sacrifising some of the speed, you can probably use secondary storage
though, to make sure that you keep at most 512 MB of data in physical
memory.

> How can I access an element arr[n] if n is bigger than INT_MAX ?
> I have tried some kind of linear/linked list, but that becomes
> disgustingly slow.

Actually, the limit of data offsets you can meaningfully access with a C
program is not INT_MAX, which may be as low as +32767 (see page 22 of
the ISO/IEC 9899:TC2 public draft of the C programming language[1]).

[1] Draft n1124 from http://www.open-std.org/JTC1/SC22/WG14/www/docs/

The largest size of object you can access with a conforming C program is
SIZE_MAX (see page 259 of the same PDF document).  The standard doesn't
require `size_t' to be much larger than `int' though, so this may still
be inadequate for processing huge datasets.

You have multiple options, the way I see it:

* Bump kern.maxdsiz to something higher (this can work for much
  larger datasets than 512 MB, but a little after 2 GB things start
  getting ugly again).

* Work on an amd64 system with LOTS of physical memory and a high
  kern.maxdsiz value.

* Try to find a variation of the Sieve of Eratosthenes that can work
  with smaller memory load (possibly sacrifising, as you guessed,
  some of the speed for space).

One possible variation would be to keep copies of the data you have
processed in secondary storage and load only the parts needed in
physical memory.  A simplistic implementation of the Sieve of
Eratosthenes may result in heavy thrashing if you just swap in and out
regions of the numeric range as they are being accessed though :(


___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: [OT] gcc: maximum length of an array?

2006-07-25 Thread P.U.Kruppa

On Sun, 23 Jul 2006, Giorgos Keramidas wrote:


On 2006-07-24 20:49, "P.U.Kruppa" <[EMAIL PROTECTED]> wrote:

Hi,

sorry for posting an [OT], but usually people on this list know
everything :-)

Since I don't know too much about programming I am frequently
fascinated by simple things like Eratosthenes' sieve.  As you might
remember, one has to create a boolean array for that. The longer the
array the more primes can be found.

With malloc() I can create an array of length 1 (10^8) and the
first 5761455 primes are calculated in a few seconds.  So of course I
would like to test length 10^9 but here my program crashes.


If this is about integer values, which are probably 32-bit, you are
hitting the kern.maxdsiz limit of 512 MB.  An array of 100,000,000
32-bit values takes up 4 * 100,000,000 = 400,000,000 (close to 400 MiB
of memory to store).  Anything above 512 MB in size will make the data
size of your program so big that it will overflow the data seg size:

   $ ulimit -a
   core file size  (blocks, -c) unlimited
   data seg size   (kbytes, -d) 524288
   ...

You can either increase kern.maxdsiz in your `/boot/loader.conf' file,
or redesign the algorithm to work with larger datasets by splitting them
in chunks that you can still process with 512 MB of data :)
*How* can I effectively split my array up? 
How can I access an element arr[n] if n is bigger than INT_MAX ?
I have tried some kind of linear/linked list, but that becomes 
disgustingly slow.


Thanks,

Uli.






___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"





*
* Peter Ulrich Kruppa - Wuppertal - Germany *
*
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: [OT] gcc: maximum length of an array?

2006-07-25 Thread P.U.Kruppa

On Sun, 23 Jul 2006, Andrew Brampton wrote:

Can you show me the line you are using to malloc with, and the lines you are 
using to access the array...


The smallest unit you can malloc on is a byte, and if you are mallocing 
1 bytes, and using each byte as a single boolean value then you are 
wasting 7/8 of your array.


It might be better to do some bit masking to gain access to the other 7 bits.

Thanks for this idea Andrew!
it took me some time to implement it - since I am quite a n00b 
and never heard about bitmasking - but with the help of

http://c-faq.com/misc/bitsets.html
I could do 10^9 .

Uli.



Andrew

- Original Message - From: "P.U.Kruppa" <[EMAIL PROTECTED]>
To: 
Sent: Monday, July 24, 2006 7:49 PM
Subject: [OT] gcc: maximum length of an array?



Hi,

sorry for posting an [OT], but usually people on this list know everything 
:-)


Since I don't know too much about programming I am frequently fascinated by 
simple things like Eratosthenes' sieve.
As you might remember, one has to create a boolean array for that. The 
longer the array the more primes can be found.


With malloc() I can create an array of length 1 (10^8)
and the first 5761455 primes are calculated in a few seconds.
So of course I would like to test length 10^9 but here my program crashes.

So my questions:
 - is there some way to create a longer array?
 - or what are the alternatives?
 - do you know some kind of fine manual about this?

Regards and thanks for all answers,

Uli.


*
* Peter Ulrich Kruppa - Wuppertal - Germany *
*
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to 
"[EMAIL PROTECTED]"










*
* Peter Ulrich Kruppa - Wuppertal - Germany *
*
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: [OT] gcc: maximum length of an array?

2006-07-23 Thread Giorgos Keramidas
On 2006-07-24 20:49, "P.U.Kruppa" <[EMAIL PROTECTED]> wrote:
> Hi,
>
> sorry for posting an [OT], but usually people on this list know
> everything :-)
>
> Since I don't know too much about programming I am frequently
> fascinated by simple things like Eratosthenes' sieve.  As you might
> remember, one has to create a boolean array for that. The longer the
> array the more primes can be found.
>
> With malloc() I can create an array of length 1 (10^8) and the
> first 5761455 primes are calculated in a few seconds.  So of course I
> would like to test length 10^9 but here my program crashes.

If this is about integer values, which are probably 32-bit, you are
hitting the kern.maxdsiz limit of 512 MB.  An array of 100,000,000
32-bit values takes up 4 * 100,000,000 = 400,000,000 (close to 400 MiB
of memory to store).  Anything above 512 MB in size will make the data
size of your program so big that it will overflow the data seg size:

$ ulimit -a
core file size  (blocks, -c) unlimited
data seg size   (kbytes, -d) 524288
...

You can either increase kern.maxdsiz in your `/boot/loader.conf' file,
or redesign the algorithm to work with larger datasets by splitting them
in chunks that you can still process with 512 MB of data :)

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"