korbit-ai[bot] commented on code in PR #34808: URL: https://github.com/apache/superset/pull/34808#discussion_r2292629334
##########
superset/result_set.py:
##########
@@ -135,21 +135,21 @@ def __init__(  # pylint: disable=too-many-locals  # noqa: C901
         if data and (not isinstance(data, list) or not isinstance(data[0], tuple)):
             data = [tuple(row) for row in data]
         array = np.array(data, dtype=numpy_dtype)
-        if array.size > 0:
-            for column in column_names:
-                try:
-                    pa_data.append(pa.array(array[column].tolist()))
-                except (
-                    pa.lib.ArrowInvalid,
-                    pa.lib.ArrowTypeError,
-                    pa.lib.ArrowNotImplementedError,
-                    ValueError,
-                    TypeError,  # this is super hackey,
-                    # https://issues.apache.org/jira/browse/ARROW-7855
-                ):
-                    # attempt serialization of values as strings
-                    stringified_arr = stringify_values(array[column])
-                    pa_data.append(pa.array(stringified_arr.tolist()))
+
+        for column in column_names:
+            try:
+                pa_data.append(pa.array(array[column].tolist()))
+            except (
+                pa.lib.ArrowInvalid,
+                pa.lib.ArrowTypeError,
+                pa.lib.ArrowNotImplementedError,
+                ValueError,
+                TypeError,  # this is super hackey,
+                # https://issues.apache.org/jira/browse/ARROW-7855
+            ):
+                # attempt serialization of values as strings
+                stringified_arr = stringify_values(array[column])
+                pa_data.append(pa.array(stringified_arr.tolist()))

Review Comment:
### Inefficient Array Type Conversions

<details>
<summary>Tell me more</summary>

###### What is the issue?
Each column is converted from a NumPy array to a Python list and then to an Arrow array. When that fails, the fallback path stringifies the values and repeats the same round trip. Every step allocates an intermediate copy of the column.

###### Why this matters
On large result sets, these repeated conversions, together with any avoidable string conversion, add memory and CPU overhead that grows with the number of rows and columns.

###### Suggested change ∙ *Feature Preview*
Try to minimize conversions by creating Arrow arrays directly, or by batching conversions:

```python
# Convert the NumPy column to an Arrow array directly when possible.
# NOTE: infer_arrow_type is a placeholder; no such helper appears in the diff.
try:
    pa_data.append(pa.array(array[column], type=infer_arrow_type(column)))
except (
    pa.lib.ArrowInvalid,
    pa.lib.ArrowTypeError,
    pa.lib.ArrowNotImplementedError,
    ValueError,
    TypeError,
):
    # Fall back to string conversion only when necessary
    stringified_arr = stringify_values(array[column])
    pa_data.append(pa.array(stringified_arr))
```

</details>

<sub>
💬 Looking for more details? Reply to this comment to chat with Korbit.
</sub>

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: notifications-h...@superset.apache.org