Amir Bar-Or created SPARK-26280:
-----------------------------------

             Summary: Spark will read entire CSV file even when limit is used
                 Key: SPARK-26280
                 URL: https://issues.apache.org/jira/browse/SPARK-26280
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.3.1
            Reporter: Amir Bar-Or

When you read a CSV file as shown below, the parser still wastes time reading the entire file:

    var lineDF1 = spark.read
      .format("com.databricks.spark.csv")
      .option("header", "true") // reading the headers
      .option("mode", "DROPMALFORMED")
      .option("delimiter", ",")
      .option("inferSchema", "false")
      .schema(line_schema)
      .load(i_lineitem)

    lineDF1.limit(10)

Even though a LocalLimit is created, it does not stop the FileScan and the parser from parsing the entire file. Is it possible to push the limit down and stop the parsing early?
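
To illustrate the behaviour being asked for, here is a minimal sketch in plain Scala. It is not Spark's reader and does not use any Spark API. `CountingParser`, `eagerLimit` and `lazyLimit` are hypothetical names introduced only for this example; the point is just the difference between applying a limit after all rows are parsed and applying it before the input is consumed.

```scala
object LimitPushdownSketch {
  // Hypothetical stand-in for a CSV row parser; it counts how many lines it parses.
  final class CountingParser(delimiter: Char = ',') {
    var parsed: Int = 0
    def parse(line: String): Array[String] = {
      parsed += 1
      line.split(delimiter)
    }
  }

  // Limit applied after parsing: every line is parsed, then the first n rows are kept.
  def eagerLimit(lines: Iterator[String], n: Int, p: CountingParser): List[Array[String]] =
    lines.map(p.parse).toList.take(n)

  // Limit applied before consumption: the iterator is bounded first,
  // so parsing stops once n rows have been produced.
  def lazyLimit(lines: Iterator[String], n: Int, p: CountingParser): List[Array[String]] =
    lines.map(p.parse).take(n).toList

  def main(args: Array[String]): Unit = {
    // Synthetic input: 1000 two-column lines.
    val data = (1 to 1000).map(i => s"$i,item$i")

    val eager = new CountingParser()
    eagerLimit(data.iterator, 10, eager)

    val pushedDown = new CountingParser()
    lazyLimit(data.iterator, 10, pushedDown)

    println(s"eager parsed ${eager.parsed} lines, lazy parsed ${pushedDown.parsed} lines")
  }
}
```

With the 1000-line synthetic input, the eager variant parses all 1000 lines while the lazy variant parses only 10. Whether the CSV scan in Spark can be bounded in a comparable way is exactly the open question of this issue.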