[ 
https://issues.apache.org/jira/browse/SPARK-44037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Sysolyatin updated SPARK-44037:
--------------------------------------
    Description: 
The CSV datasource supports the maxColumns and maxCharsPerColumn options, but 
these two options do not make it possible to limit the row size properly.

For instance, if I want to limit the row size to at most 100 characters 
and I set maxColumns to 10 and maxCharsPerColumn to 10, then:
# A user cannot read a column longer than 10 characters, even if the row size is <= 100
# A user cannot read more than 10 columns, even if the row size is <= 100

I suggest adding an additional option, maxCharsPerRow.

  was:
The CSV datasource supports the maxColumns and maxCharsPerColumn options, but 
these two options do not make it possible to limit the row size properly.

For instance, if I want to limit the row size to at most 100 characters 
and I set maxColumns to 10 and maxCharsPerColumn to 10, then:
# A user cannot read a column longer than 10 characters, even if the row size is <= 100
# A user cannot read more than 10 columns where each column is < 5 chars, even if 
the row size is <= 100

I suggest adding an additional option, maxCharsPerRow.


> Add maxCharsPerRow option for CSV datasource
> --------------------------------------------
>
>                 Key: SPARK-44037
>                 URL: https://issues.apache.org/jira/browse/SPARK-44037
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.4.0
>            Reporter: Dmitry Sysolyatin
>            Priority: Major
>
> The CSV datasource supports the maxColumns and maxCharsPerColumn options, but 
> these two options do not make it possible to limit the row size properly.
> For instance, if I want to limit the row size to at most 100 characters 
> and I set maxColumns to 10 and maxCharsPerColumn to 10, then:
> # A user cannot read a column longer than 10 characters, even if the row size is <= 100
> # A user cannot read more than 10 columns, even if the row size is <= 100
> I suggest adding an additional option, maxCharsPerRow.
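
The limitation described above can be illustrated with a small, Spark-free 
sketch. It is not Spark's or the underlying parser's implementation: 
check_row and check_row_total are hypothetical helpers that only mimic the 
two existing per-column limits and the proposed row-level limit. The sketch 
also assumes that "row size" means the sum of field lengths without 
delimiters, which the issue does not specify.

```python
import csv
import io


def check_row(fields, max_columns=10, max_chars_per_column=10):
    """Simplified stand-in for the existing maxColumns / maxCharsPerColumn checks."""
    if len(fields) > max_columns:
        return False
    return all(len(field) <= max_chars_per_column for field in fields)


def check_row_total(fields, max_chars_per_row=100):
    """Hypothetical row-level check: total characters across all fields.

    Assumption: delimiters and quotes are not counted toward the row size.
    """
    return sum(len(field) for field in fields) <= max_chars_per_row


def parse(line):
    """Split a single CSV line into its fields."""
    return next(csv.reader(io.StringIO(line)))


# Case 1: one wide column, but only 51 characters in the whole row.
wide_column = parse("a," + "x" * 50)

# Case 2: 20 narrow columns, but only 40 characters in the whole row.
many_columns = parse(",".join(["ab"] * 20))

for row in (wide_column, many_columns):
    total = sum(len(field) for field in row)
    print(len(row), total, check_row(row), check_row_total(row))
# prints:
# 2 51 False True
# 20 40 False True
```

Both rows are well under 100 characters, yet the per-column limits reject 
them, while a single row-level limit would accept them.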



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

