[jira] [Commented] (ARROW-2051) [Python] Support serializing UUID objects to tables
[ https://issues.apache.org/jira/browse/ARROW-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932786#comment-16932786 ]

Mitar commented on ARROW-2051:
------------------------------

Sounds good. I will then explore how to do that through extension types.

> [Python] Support serializing UUID objects to tables
> ---------------------------------------------------
>
>                 Key: ARROW-2051
>                 URL: https://issues.apache.org/jira/browse/ARROW-2051
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.8.0
>            Reporter: Omer Katz
>            Priority: Major
>
> UUID objects can be easily supported and can be represented as 128-bit
> integers or a stream of bytes.
> The fastest way I know to construct a UUID object is by using its 128-bit
> (16 bytes) integer representation.
>
> {code:java}
> %timeit uuid.UUID(int=24197857161011715162171839636988778104)
> 611 ns ± 6.27 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)
> %timeit uuid.UUID(bytes=b'\x124Vx\x124Vx\x124Vx\x124Vx')
> 1.17 µs ± 7.5 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)
> %timeit uuid.UUID('12345678-1234-5678-1234-567812345678')
> 1.47 µs ± 6.08 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)
> {code}
>
> Right now I have to do this manually which is pretty tedious.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-2051) [Python] Support serializing UUID objects to tables
[ https://issues.apache.org/jira/browse/ARROW-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932645#comment-16932645 ]

Antoine Pitrou commented on ARROW-2051:
---------------------------------------

1) We don't have 128-bit numbers in Arrow. We do have fixed-size binary data.
2) Arrow now has extension types, so you could probably implement a UUID extension type (though the Python API for extension types may still be in flux).
3) The answer to "why not..." questions generally is that it costs maintenance time for us, so unless some contributor (such as you) wants to bear the maintenance cost it probably won't happen if we don't find it useful enough.
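To illustrate the fixed-size binary idea from point 1, here is a minimal sketch that uses only the Python standard library. It is not pyarrow code and not a proposed implementation: the helper names `pack_uuids` and `unpack_uuids` and the constant `UUID_WIDTH` are hypothetical, chosen for this example. It only shows that every UUID fits a 16-byte slot, which is the layout a fixed-size binary column (presumably something like `pa.binary(16)` in pyarrow) would store.

```python
import uuid

UUID_WIDTH = 16  # bytes per UUID (128 bits); hypothetical constant for this sketch


def pack_uuids(values):
    """Concatenate UUIDs into one contiguous buffer of 16-byte slots."""
    return b"".join(u.bytes for u in values)


def unpack_uuids(buffer):
    """Rebuild uuid.UUID objects from a buffer of 16-byte slots."""
    if len(buffer) % UUID_WIDTH != 0:
        raise ValueError("buffer length must be a multiple of 16")
    return [
        uuid.UUID(bytes=buffer[i:i + UUID_WIDTH])
        for i in range(0, len(buffer), UUID_WIDTH)
    ]


originals = [uuid.UUID("12345678-1234-5678-1234-567812345678"), uuid.uuid4()]
packed = pack_uuids(originals)    # 2 UUIDs -> 32 bytes
restored = unpack_uuids(packed)   # round-trips back to equal UUID objects
```

An extension type, as suggested in point 2, would presumably wrap this kind of 16-byte storage and perform the conversion to and from `uuid.UUID` at the boundary.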
[jira] [Commented] (ARROW-2051) [Python] Support serializing UUID objects to tables
[ https://issues.apache.org/jira/browse/ARROW-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932642#comment-16932642 ]

Wes McKinney commented on ARROW-2051:
-------------------------------------

You are more than welcome to contribute a UUID extension type for use in Python.
[jira] [Commented] (ARROW-2051) [Python] Support serializing UUID objects to tables
[ https://issues.apache.org/jira/browse/ARROW-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932618#comment-16932618 ]

Mitar commented on ARROW-2051:
------------------------------

I mean, you have 128-bit numbers in Arrow? So why not support converting UUIDs to that?
[jira] [Commented] (ARROW-2051) [Python] Support serializing UUID objects to tables
[ https://issues.apache.org/jira/browse/ARROW-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932447#comment-16932447 ]

Joris Van den Bossche commented on ARROW-2051:
----------------------------------------------

What is the exact idea here? To provide a way to construct an array of UUIDs from Python objects? The exact proposed enhancement is not fully clear to me.

But, I would say, as long as we have no UUID type in Arrow (or an extension type, not sure if there are any plans on that), construction methods for UUID seem out of scope for pyarrow to me.
[jira] [Commented] (ARROW-2051) [Python] Support serializing UUID objects to tables
[ https://issues.apache.org/jira/browse/ARROW-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932371#comment-16932371 ]

Antoine Pitrou commented on ARROW-2051:
---------------------------------------

[~jorisvandenbossche] do you think it's worthwhile keeping an issue open for this?