Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-08 Thread Sebastian
On 12/08/2015 02:17 AM, Warren Weckesser wrote:
> On Sun, Dec 6, 2015 at 6:55 PM, Allan Haldane wrote:
>
>> It has also crossed my mind that np.random.randint and
>> np.random.rand could use an extra 'dtype' keyword.
>
> +1.  Not a high priority, but it would be nice.
Opened an issue for this: https://github.com/numpy/numpy/issues/6790
> Warren
Sebastian
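
For readers coming to this thread from the archive: as far as I know, the
dtype keyword requested in that issue was later added to np.random.randint
(NumPy 1.11 and newer). A minimal sketch, assuming such a NumPy version; the
small shape is only for illustration:

```python
import numpy as np

# Assumes NumPy >= 1.11, where randint accepts a dtype keyword.
# The upper bound is exclusive, so 256 is needed to include 255.
spectrum = np.random.randint(0, 256, size=(4, 8), dtype=np.uint8)

print(spectrum.dtype)   # uint8
print(spectrum.nbytes)  # 32, one byte per element
```

With the keyword, the array is allocated as uint8 from the start, so no
wider intermediate array is created.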



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-08 Thread Stephan Hoyer
On Sun, Dec 6, 2015 at 3:55 PM, Allan Haldane wrote:

>
> I've also often wanted to generate large datasets of random uint8 and
> uint16. As a workaround, this is something I have used:
>
> np.ndarray(100, 'u1', np.random.bytes(100))
>
> It has also crossed my mind that np.random.randint and np.random.rand
> could use an extra 'dtype' keyword. It didn't look easy to implement though.
>

Another workaround that avoids creating a copy is to use the view method,
e.g.,
np.random.randint(np.iinfo(int).min, np.iinfo(int).max,
size=(1,)).view(np.uint8)  # creates 8 random bytes

Cheers,
Stephan
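
A small sketch of the view approach described above. It uses an explicit
int64 array as a stand-in for the randint output, so that the byte count
does not depend on the platform's default integer; that choice is an
assumption for illustration, not part of the original one-liner:

```python
import numpy as np

# Stand-in for the randint output: one 64-bit integer.
ints = np.array([0x0102030405060708], dtype=np.int64)

# view() reinterprets the same buffer; no data is copied.
as_bytes = ints.view(np.uint8)

print(as_bytes.shape)                    # (8,)
print(np.shares_memory(ints, as_bytes))  # True
print(as_bytes.flags['OWNDATA'])         # False
```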


Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-08 Thread Allan Haldane
On 12/08/2015 07:40 PM, Stephan Hoyer wrote:
> On Sun, Dec 6, 2015 at 3:55 PM, Allan Haldane wrote:
>
>> I've also often wanted to generate large datasets of random uint8
>> and uint16. As a workaround, this is something I have used:
>>
>> np.ndarray(100, 'u1', np.random.bytes(100))
>>
>> It has also crossed my mind that np.random.randint and
>> np.random.rand could use an extra 'dtype' keyword. It didn't look
>> easy to implement though.
>
> Another workaround that avoids creating a copy is to use the view
> method, e.g.,
> np.random.randint(np.iinfo(int).min, np.iinfo(int).max,
> size=(1,)).view(np.uint8)  # creates 8 random bytes

Just to note, the line I pasted doesn't copy either, according to the
OWNDATA flag.

Cheers,
Allan


Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-08 Thread Allan Haldane
On 12/08/2015 08:01 PM, Allan Haldane wrote:
> On 12/08/2015 07:40 PM, Stephan Hoyer wrote:
>> On Sun, Dec 6, 2015 at 3:55 PM, Allan Haldane wrote:
>>
>>> I've also often wanted to generate large datasets of random uint8
>>> and uint16. As a workaround, this is something I have used:
>>>
>>> np.ndarray(100, 'u1', np.random.bytes(100))
>>>
>>> It has also crossed my mind that np.random.randint and
>>> np.random.rand could use an extra 'dtype' keyword. It didn't look
>>> easy to implement though.
>>
>> Another workaround that avoids creating a copy is to use the view
>> method, e.g.,
>> np.random.randint(np.iinfo(int).min, np.iinfo(int).max,
>> size=(1,)).view(np.uint8)  # creates 8 random bytes
> 
> Just to note, the line I pasted doesn't copy either, according to the
> OWNDATA flag.
> 
> Cheers,
> Allan

Oops, but I forgot my version is read-only. If you want to write to it,
you do need to make a copy; that's true.

Allan
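
A short sketch of the point about the read-only buffer. It reuses the
np.ndarray call from earlier in the thread; the variable names are only
for illustration:

```python
import numpy as np

raw = np.random.bytes(100)         # immutable Python bytes object
arr = np.ndarray(100, 'u1', raw)   # wraps the bytes buffer, no copy

print(arr.flags['OWNDATA'])    # False
print(arr.flags['WRITEABLE'])  # False

writable = arr.copy()          # the copy owns its data and can be modified
writable[0] = 0
```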



Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-08 Thread Matthew Brett
Hi,

On Tue, Dec 8, 2015 at 4:40 PM, Stephan Hoyer wrote:
> On Sun, Dec 6, 2015 at 3:55 PM, Allan Haldane wrote:
>>
>>
>> I've also often wanted to generate large datasets of random uint8 and
>> uint16. As a workaround, this is something I have used:
>>
>> np.ndarray(100, 'u1', np.random.bytes(100))
>>
>> It has also crossed my mind that np.random.randint and np.random.rand
>> could use an extra 'dtype' keyword. It didn't look easy to implement though.
>
>
> Another workaround that avoids creating a copy is to use the view method,
> e.g.,
> np.random.randint(np.iinfo(int).min, np.iinfo(int).max,
> size=(1,)).view(np.uint8)  # creates 8 random bytes

I think that is not quite (pseudo) random, because the second parameter
to randint is the max value plus 1 (the upper bound is exclusive), and:

np.random.random_integers(np.iinfo(int).min, np.iinfo(int).max + 1,
size=(1,)).view(np.uint8)

gives:

OverflowError: Python int too large to convert to C long

Cheers,

Matthew
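
A minimal sketch of the exclusive upper bound mentioned above, using small
ranges so that it runs anywhere; the variable names are only for
illustration:

```python
import numpy as np

# high is exclusive: randint(0, 2) can only return 0 or 1.
draws = np.random.randint(0, 2, size=1000)
print(draws.max() <= 1)         # True

# So randint(0, 255) never produces 255; use 256 to cover all of uint8.
full_range = np.random.randint(0, 256, size=1000)
print(full_range.max() <= 255)  # True
```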


Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-07 Thread Elliot Hallmark
David,

> I'm concluding that the .astype(np.uint8) is applied after the array is
> constructed, instead of during the process.

That is how Python works in general.  astype is a method of an array, so
randint needs to return the array before there is something with an astype
method to call.  A dtype keyword arg to randint, on the other hand, would
influence the construction of the array.

Elliot
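
To make the order of operations concrete, here is a small sketch with the
chained call split into two steps. The tiny shape and the names tmp and
small are only for illustration:

```python
import numpy as np

tmp = np.random.randint(0, 256, (4, 4))  # full-width integer array is built first
small = tmp.astype(np.uint8)             # then a second, smaller array is created

print(tmp.dtype, tmp.nbytes)      # platform default integer, e.g. int64 and 128
print(small.dtype, small.nbytes)  # uint8 16
```

Both arrays exist at the same time until tmp is released, so peak memory is
set by the wider intermediate array.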


Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-07 Thread Warren Weckesser
On Sun, Dec 6, 2015 at 6:55 PM, Allan Haldane wrote:

>
> I've also often wanted to generate large datasets of random uint8 and
> uint16. As a workaround, this is something I have used:
>
> np.ndarray(100, 'u1', np.random.bytes(100))
>
> It has also crossed my mind that np.random.randint and np.random.rand
> could use an extra 'dtype' keyword.



+1.  Not a high priority, but it would be nice.

Warren



> It didn't look easy to implement though.
>
> Allan


Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-06 Thread DAVID SAROFF (RIT Student)
Allan,

I see from a Google search on your name that you are in the physics
department at Rutgers. I got my BA in Physics there in 1975. Biological
physics. A thought: is there an entropy that can be assigned to the DNA in
an organism? I don't mean the usual thing, coupled to the heat bath.
Evolution blindly explores metabolic and signaling pathways, and tends
towards disorder, as long as it functions. Someone working out signaling
pathways some years ago wrote that they were senselessly complex, branched
and interlocked. I think that is to be expected. Evolution doesn't find
minimalist, clear, rational solutions. Look at the Amazon rain forest. What
are all those beetles and butterflies and frogs for? It is the wrong
question. I think some measure of the complexity could be related to the
amount of time that ecosystem has existed. Similarly for genomes.

On Sun, Dec 6, 2015 at 6:55 PM, Allan Haldane wrote:

>
> I've also often wanted to generate large datasets of random uint8 and
> uint16. As a workaround, this is something I have used:
>
> np.ndarray(100, 'u1', np.random.bytes(100))
>
> It has also crossed my mind that np.random.randint and np.random.rand
> could use an extra 'dtype' keyword. It didn't look easy to implement though.
>
> Allan



-- 
David P. Saroff
Rochester Institute of Technology
54 Lomb Memorial Dr, Rochester, NY 14623
david.sar...@mail.rit.edu | (434) 227-6242


Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-06 Thread Allan Haldane


I've also often wanted to generate large datasets of random uint8 and 
uint16. As a workaround, this is something I have used:


np.ndarray(100, 'u1', np.random.bytes(100))

It has also crossed my mind that np.random.randint and np.random.rand 
could use an extra 'dtype' keyword. It didn't look easy to implement though.


Allan



Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-06 Thread Matthew Brett
Hi,

On Sun, Dec 6, 2015 at 12:39 PM, DAVID SAROFF (RIT Student) wrote:
> This works. A big array of eight bit random numbers is constructed:
>
> import numpy as np
>
> spectrumArray = np.random.randint(0,255, (2**20,2**12)).astype(np.uint8)
>
>
>
> This fails. It eats up all 64GBy of RAM:
>
> spectrumArray = np.random.randint(0,255, (2**21,2**12)).astype(np.uint8)
>
>
> The difference is a factor of two, 2**21 rather than 2**20, for the extent
> of the first axis.

I think what's happening is that this:

np.random.randint(0,255, (2**21,2**12))

creates 2**33 random integers, which (on 64-bit) will be of dtype
int64 = 8 bytes, giving total size 2 ** (21 + 12 + 6) = 2 ** 39 bytes
= 512 GiB.

Cheers,

Matthew


Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-06 Thread DAVID SAROFF (RIT Student)
Matthew,

That looks right. I'm concluding that the .astype(np.uint8) is applied
after the array is constructed, instead of during the process. This random
array is a test case. In the production analysis of radio telescope data
this is how the data comes in, and there is no problem with 10GBy files:
linearInputData = np.fromfile(dataFile, dtype = np.uint8, count = -1)
spectrumArray = linearInputData.reshape(nSpectra,sizeSpectrum)
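
A self-contained sketch of the same pattern, for anyone who wants to try it
without telescope data. The temporary file, its name, and the tiny
dimensions are hypothetical stand-ins:

```python
import os
import tempfile
import numpy as np

n_spectra, size_spectrum = 4, 16   # hypothetical, tiny dimensions

# Write some uint8 samples to a temporary file to stand in for real data.
samples = np.arange(n_spectra * size_spectrum, dtype=np.uint8)
path = os.path.join(tempfile.mkdtemp(), "spectra.bin")
samples.tofile(path)

# fromfile allocates the array as uint8 from the start: one byte per sample.
linear_input = np.fromfile(path, dtype=np.uint8, count=-1)
spectrum_array = linear_input.reshape(n_spectra, size_spectrum)

print(spectrum_array.dtype, spectrum_array.shape)  # uint8 (4, 16)
```

Because the dtype is given at read time, no wider intermediate array is
ever created, which is why the file-based path scales where the random test
case did not.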


On Sun, Dec 6, 2015 at 4:07 PM, Matthew Brett wrote:

> Hi,
>
> On Sun, Dec 6, 2015 at 12:39 PM, DAVID SAROFF (RIT Student) wrote:
> > This works. A big array of eight bit random numbers is constructed:
> >
> > import numpy as np
> >
> > spectrumArray = np.random.randint(0,255, (2**20,2**12)).astype(np.uint8)
> >
> >
> >
> > This fails. It eats up all 64GBy of RAM:
> >
> > spectrumArray = np.random.randint(0,255, (2**21,2**12)).astype(np.uint8)
> >
> >
> > The difference is a factor of two, 2**21 rather than 2**20, for the extent
> > of the first axis.
>
> I think what's happening is that this:
>
> np.random.randint(0,255, (2**21,2**12))
>
> creates 2**33 random integers, which (on 64-bit) will be of dtype
> int64 = 8 bytes, giving total size 2 ** (21 + 12 + 6) = 2 ** 39 bytes
> = 512 GiB.
>
> Cheers,
>
> Matthew
>



-- 
David P. Saroff
Rochester Institute of Technology
54 Lomb Memorial Dr, Rochester, NY 14623
david.sar...@mail.rit.edu | (434) 227-6242


Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-06 Thread Jaime Fernández del Río
On Sun, Dec 6, 2015 at 10:07 PM, Matthew Brett wrote:

> Hi,
>
> On Sun, Dec 6, 2015 at 12:39 PM, DAVID SAROFF (RIT Student) wrote:
> > This works. A big array of eight bit random numbers is constructed:
> >
> > import numpy as np
> >
> > spectrumArray = np.random.randint(0,255, (2**20,2**12)).astype(np.uint8)
> >
> >
> >
> > This fails. It eats up all 64GBy of RAM:
> >
> > spectrumArray = np.random.randint(0,255, (2**21,2**12)).astype(np.uint8)
> >
> >
> > The difference is a factor of two, 2**21 rather than 2**20, for the extent
> > of the first axis.
>
> I think what's happening is that this:
>
> np.random.randint(0,255, (2**21,2**12))
>
> creates 2**33 random integers, which (on 64-bit) will be of dtype
> int64 = 8 bytes, giving total size 2 ** (21 + 12 + 6) = 2 ** 39 bytes
> = 512 GiB.
>

8 is only 2**3, so it is "just" 2**36 bytes = 64 GiB, which also explains
why the half-sized array does work, but yes, that is most likely what's
happening.

Jaime
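
The corrected arithmetic can be checked with a few lines of plain Python.
This assumes 8-byte (int64) elements for the intermediate array, as
discussed above:

```python
n_elements = 2**21 * 2**12      # 2**33 elements
bytes_int64 = n_elements * 8    # 8 bytes = 2**3, so 2**36 bytes
gib = 2**30

print(bytes_int64 // gib)       # 64  (int64 intermediate, 2**21 rows)
print(bytes_int64 // 2 // gib)  # 32  (int64 intermediate, 2**20 rows)
print(n_elements // gib)        # 8   (final uint8 array, 2**21 rows)
```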

>
> Cheers,
>
> Matthew
>



-- 
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.


[Numpy-discussion] array of random numbers fails to construct

2015-12-06 Thread DAVID SAROFF (RIT Student)
This works. A big array of eight bit random numbers is constructed:

import numpy as np

spectrumArray = np.random.randint(0,255, (2**20,2**12)).astype(np.uint8)



This fails. It eats up all 64GBy of RAM:

spectrumArray = np.random.randint(0,255, (2**21,2**12)).astype(np.uint8)


The difference is a factor of two, 2**21 rather than 2**20, for the extent
of the first axis.
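
One possible workaround, sketched under the assumption that randint returns
a wide default-integer array: fill a preallocated uint8 array in row blocks,
so that only a small temporary exists at any time. The helper random_uint8
and its block_rows parameter are hypothetical names, and the upper bound is
256 rather than 255 so that the value 255 can occur:

```python
import numpy as np

def random_uint8(shape, block_rows=1024):
    """Fill a uint8 array block by block to keep the wide temporary small."""
    out = np.empty(shape, dtype=np.uint8)
    for start in range(0, shape[0], block_rows):
        stop = min(start + block_rows, shape[0])
        # Only this block is ever held as a default-integer array.
        block = np.random.randint(0, 256, (stop - start,) + tuple(shape[1:]))
        out[start:stop] = block.astype(np.uint8)
    return out

spectrum = random_uint8((10, 8), block_rows=4)
print(spectrum.dtype, spectrum.shape)  # uint8 (10, 8)
```

For the full-size case the call would be random_uint8((2**21, 2**12)), which
needs about 8 GiB for the result plus one block of temporary storage.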

-- 
David P. Saroff
Rochester Institute of Technology
54 Lomb Memorial Dr, Rochester, NY 14623
david.sar...@mail.rit.edu | (434) 227-6242