[
https://issues.apache.org/jira/browse/ARROW-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405482#comment-17405482
]
Eduardo Ponce edited comment on ARROW-12657 at 8/26/21, 9:49 PM:
-----------------------------------------------------------------
The inverse operation, commonly called *hex()*, converts an integer into a hex
string.
There are several things to consider for the hex-to-number conversion:
Python/numpy use [type casting for hex conversion to
integers|https://docs.python.org/3/library/functions.html#int]. Only the
built-in *int()* supports hex conversion; the sized types *int32()*,
*int64()*, etc. do not.
{code:python}
int('0xa', base=16) # 'base' option is mandatory
int('0xA', base=16) # case-insensitive
int('a', base=16) # '0x' is optional
int('000000a', base=16) # if '0x' is not given, any number of '0's to the left
                        # of the first non-zero value are ignored
int(' a\t', base=16) # left/right whitespace are ignored
{code}
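The Python rules listed above can be checked directly. The following is a minimal sketch that uses only the built-in *int()*; the helper name parse_hex is hypothetical and is not part of Python or Arrow.

```python
def parse_hex(text: str) -> int:
    """Hypothetical helper: parse a hex string with the built-in int()."""
    return int(text, base=16)


# Each spelling below is accepted and denotes the same value.
samples = ['0xa', '0xA', 'a', '000000a', ' a\t']
values = [parse_hex(s) for s in samples]

# Without an explicit base, int() assumes base 10 and rejects the '0x' prefix.
try:
    int('0xa')
    default_base_ok = True
except ValueError:
    default_base_ok = False
```

Here every entry of values compares equal, and default_base_ok ends up False, which is why the 'base' argument is needed for hex input.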
R uses a [function for hex
conversion|https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/numeric]
{code:r}
as.numeric('0xa') # equivalently, as.integer('0xa')
as.numeric('0XA') # case-insensitive
as.numeric('0xa') # '0x' is mandatory, there is no 'base' option
as.numeric('000000xa') # Warning: returns NA. '0's to the left are not
                       # supported
as.numeric(' 0xa\t') # left/right whitespace are ignored
{code}
SQL (here, SQL Server) uses a [conversion function for
hex-to-number|https://www.w3schools.com/SQL/func_sqlserver_convert.asp]
{code:sql}
SELECT CONVERT(int, 0xa)
SELECT CONVERT(int, 0XA) -- case-insensitive
SELECT CONVERT(int, 0xa) -- '0x' is mandatory, there is no 'base' option
SELECT CONVERT(int, 0000xa) -- Error. '0's to the left are not supported
SELECT CONVERT(int, '0xa') -- Error. Literal strings are not supported
{code}
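The differences above (prefix mandatory or optional, whitespace handling, zeros before the prefix) are the knobs a hex-to-number kernel would need to settle. As a rough sketch of how those choices could be expressed, and not a proposal for the actual Arrow API, the hypothetical function hex_to_int below makes them explicit options; its name and parameters are assumptions for illustration only.

```python
import re

# Optional '0x'/'0X' prefix followed by one or more hex digits.
_HEX_PATTERN = re.compile(r'(0[xX])?([0-9a-fA-F]+)')


def hex_to_int(text: str, *, require_prefix: bool = False,
               strip_whitespace: bool = True) -> int:
    """Hypothetical helper: parse a hex string under configurable rules."""
    if strip_whitespace:
        # Python and R both ignore left/right whitespace.
        text = text.strip()
    match = _HEX_PATTERN.fullmatch(text)
    if match is None:
        # Covers inputs such as '000000xa', where zeros precede the prefix.
        raise ValueError(f'not a hex string: {text!r}')
    if require_prefix and match.group(1) is None:
        # R-like and SQL-like behavior: the '0x' prefix is mandatory.
        raise ValueError(f"missing '0x' prefix: {text!r}")
    return int(match.group(2), 16)
```

With the defaults this behaves like Python's *int(..., base=16)* for the cases listed above; passing require_prefix=True approximates the stricter R and SQL behavior.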
> [C++][Python][Compute] String hex to numeric conversion and bit shifting
> ------------------------------------------------------------------------
>
> Key: ARROW-12657
> URL: https://issues.apache.org/jira/browse/ARROW-12657
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++, Python
> Reporter: Franz
> Assignee: William Malpica
> Priority: Major
>
> Hi Apache Arrow Community,
> thanks for the great work on this project - it is really a game changer. I've
> started to use it more frequently since more and more compute kernels became
> available.
> However, I have a current requirement which I can't fulfill, yet. More
> concretely, I have hex values as strings. I need to convert them to numeric
> type (int) and apply bit shifting.
> Currently, I can't find a way to do so. I tried type casts (string to int)
> like
> {code:python}
> import pyarrow as pa
> array = pa.array(["0x2001591", "0x2000848", "0x2000123"])
> array.cast(pa.uint32())
> {code}
> However this results in _ArrowInvalid: Failed to parse string: '0x2000123' as
> a scalar of type uint32._
> Moreover, I will need to apply bit shifting once converted. I'm not sure if
> there is anything comparable in the compute kernels, yet.
> Thanks for any help in advance.
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)