[jira] [Commented] (FLINK-32565) Support cast from NUMBER to BYTES

2023-08-04 Thread Hanyu Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17751138#comment-17751138
 ] 

Hanyu Zheng commented on FLINK-32565:
-

[~twalthr], my research suggests that other vendors use CAST rather than 
CONVERT for this.

> Support cast from NUMBER to BYTES
> -
>
> Key: FLINK-32565
> URL: https://issues.apache.org/jira/browse/FLINK-32565
> Project: Flink
> Issue Type: Sub-task
> Reporter: Hanyu Zheng
> Assignee: Hanyu Zheng
> Priority: Major
> Labels: pull-request-available
>
> We are undertaking a task that requires casting from a NUMBER type to BYTES. 
> In particular, we have an INTEGER 1234. Our current approach is to convert 
> this INTEGER to BYTES using the following SQL query:
> {code:sql}
> SELECT CAST(1234 AS BYTES);
> {code}
> However, we encounter an issue when executing this query, potentially due to 
> an error in the conversion between INTEGER and BYTES. Our goal is to identify 
> and correct this issue so that our query can execute successfully. The tasks 
> involved are:
>  # Investigate and pinpoint the specific reason for the conversion failure 
> from INTEGER to BYTES.
>  # Design and implement a solution that enables our query to function 
> correctly.
>  # Test this solution across all required scenarios to ensure its robustness.
>  
> See also:
> 1. PostgreSQL: PostgreSQL supports casting from NUMBER types (INTEGER, 
> BIGINT, DECIMAL, etc.) to BYTES type (BYTEA). In PostgreSQL, you can use CAST 
> or the :: operator for performing the conversion. URL: 
> [https://www.postgresql.org/docs/current/sql-expressions.html#SQL-SYNTAX-TYPE-CASTS]
> 2. MySQL: MySQL supports casting from NUMBER types (INTEGER, BIGINT, DECIMAL, 
> etc.) to BYTES type (BINARY or BLOB). In MySQL, you can use CAST or CONVERT 
> functions for performing the conversion. URL: 
> [https://dev.mysql.com/doc/refman/8.0/en/cast-functions.html]
> 3. Microsoft SQL Server: SQL Server supports casting from NUMBER types (INT, 
> BIGINT, NUMERIC, etc.) to BYTES type (VARBINARY or IMAGE). You can use CAST 
> or CONVERT functions for performing the conversion. URL: 
> [https://docs.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql]
> 4. Oracle Database: Oracle supports casting from NUMBER types (NUMBER, 
> INTEGER, FLOAT, etc.) to BYTES type (RAW). You can use UTL_RAW.CAST_TO_RAW 
> function for performing the conversion. URL: 
> [https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/TO_BINARY_DOUBLE.html]
>  
> Regarding the byte order problem that may arise (little vs. big endian): 
>  
> 1. Apache Hadoop: Hadoop, being an open-source framework, has to deal with 
> byte order issues across different platforms and architectures. Hadoop's 
> SequenceFile format serializes keys and values through Java's DataOutput 
> interface, which always writes multi-byte values in big-endian order. This 
> fixed order ensures that data is read and written correctly, regardless of 
> the endianness of the platform.
> 2. Apache Avro: Avro is a data serialization system used by various big data 
> frameworks like Hadoop and Apache Kafka. Avro uses a compact binary encoding 
> whose byte layout is fixed by its specification: int and long values are 
> written as variable-length zig-zag integers, and float and double values as 
> fixed-width little-endian bytes. This allows Avro to handle endianness 
> issues seamlessly when data is exchanged between systems with different 
> byte orders.
> 3. Apache Parquet: Parquet is a columnar storage format used in big data 
> processing frameworks like Apache Spark. Parquet uses a little-endian format 
> for encoding numeric values, which is the most common format on modern 
> systems. When reading or writing Parquet data, data processing engines 
> typically handle any necessary byte order conversions transparently.
> 4. Apache Spark: Spark is a popular big data processing engine that can 
> handle data on distributed systems. It relies on the underlying data formats 
> it reads (e.g., Avro, Parquet, ORC) to manage byte order issues. These 
> formats are designed to handle byte order correctly, ensuring that Spark can 
> handle data correctly on different platforms.
> 5. Google Cloud BigQuery: BigQuery is a serverless data warehouse offered by 
> Google Cloud. When dealing with binary data and endianness, BigQuery relies 
> on the data encoding format. For example, when loading data in Avro or 
> Parquet formats, the byte order is already fixed by the format itself, 
> allowing BigQuery to handle data across different platforms correctly.
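
As a concrete illustration of the byte order question, the sketch below encodes the INTEGER 1234 from the example as four bytes in both orders. It uses only Python's built-in `int.to_bytes`. The 4-byte width is an assumption (INTEGER taken as a signed 32-bit value), and the output is not a statement of what Flink's CAST returns.

```python
# Illustrative only: the INTEGER 1234 as 4 bytes in each byte order.
# Assumption: INTEGER is treated as a signed 32-bit value.
value = 1234  # 0x000004D2

big_endian = value.to_bytes(4, byteorder="big", signed=True)
little_endian = value.to_bytes(4, byteorder="little", signed=True)

print(big_endian.hex())     # 000004d2
print(little_endian.hex())  # d2040000
```

The two results contain the same bytes in reverse order, so a cast to BYTES is only well defined once one of the two orders is chosen and documented.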



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32565) Support cast from NUMBER to BYTES

2023-07-24 Thread Hanyu Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746569#comment-17746569
 ] 

Hanyu Zheng commented on FLINK-32565:
-

[~twalthr] Ok



[jira] [Commented] (FLINK-32565) Support cast from NUMBER to BYTES

2023-07-24 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746450#comment-17746450
 ] 

Timo Walther commented on FLINK-32565:
--

[~hanyuzheng] In general, the use case is valid. But I'm wondering whether we 
should rather provide such functionality through a special function like 
"CONVERT". AFAIK SQL Server also distinguishes between CAST and CONVERT. The 
problem of byte order may arise (little vs. big endian). Could you do some 
research into how other big vendors deal with this problem?
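
The byte order concern can be made concrete with a small sketch: bytes written in one order decode to a different number if the reader assumes the other order. The example uses Python's standard `struct` module. Treating INTEGER as a signed 32-bit value is an assumption made for illustration, not a description of Flink's behavior.

```python
import struct

# Writer encodes 1234 as a signed 32-bit integer in big-endian order.
encoded = struct.pack(">i", 1234)

# A reader that assumes the same order recovers the value.
as_big = struct.unpack(">i", encoded)[0]

# A reader that assumes little-endian gets a different number.
as_little = struct.unpack("<i", encoded)[0]

print(as_big)     # 1234
print(as_little)  # -771489792
```

Whichever function name is chosen (CAST or CONVERT), the result is only portable if the byte order is part of its specified behavior.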
