Xiangrui Meng created SPARK-27534:
-------------------------------------

             Summary: Do not load `content` column in binary data source if it is not selected
                 Key: SPARK-27534
                 URL: https://issues.apache.org/jira/browse/SPARK-27534
             Project: Spark
          Issue Type: Story
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Xiangrui Meng

This is a follow-up task from SPARK-25348. To save I/O cost, Spark should not read a file's contents when the query does not select the `content` column. For example, the following query needs only the `length` column, so no file should be opened:

{code}
spark.read.format("binaryFile").load(path).filter($"length" < 1000000).count()
{code}
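
To illustrate the requested behavior, here is a minimal sketch in plain Scala, assuming the reader is told which columns the query requires. It is not Spark's implementation: `BinaryFileRow`, `readRow`, and `requiredColumns` are hypothetical names introduced for this example, and the field names are assumed to mirror the `path`, `modificationTime`, `length`, and `content` columns of the binary file source. The point is only that file metadata can be gathered without opening the file, and the bytes are read solely when `content` is requested.

```scala
import java.nio.file.{Files, Path}

// Hypothetical row type; the field names are assumed to mirror the
// binaryFile columns. `content` is optional so it can be left unloaded.
final case class BinaryFileRow(
    path: String,
    modificationTime: Long,
    length: Long,
    content: Option[Array[Byte]])

object LazyContentSketch {
  // Hypothetical helper, not part of Spark: builds one row for `file`,
  // reading the bytes only if the query asked for the `content` column.
  def readRow(file: Path, requiredColumns: Set[String]): BinaryFileRow = {
    val content =
      if (requiredColumns.contains("content")) Some(Files.readAllBytes(file))
      else None // metadata-only query: the file is never opened for reading
    BinaryFileRow(
      path = file.toUri.toString,
      modificationTime = Files.getLastModifiedTime(file).toMillis,
      length = Files.size(file),
      content = content)
  }

  def main(args: Array[String]): Unit = {
    val tmp = Files.createTempFile("binary-file-sketch", ".bin")
    try {
      Files.write(tmp, Array[Byte](1, 2, 3))

      // Only `length` is required, as in the filter-and-count query above.
      val metaOnly = readRow(tmp, Set("length"))
      assert(metaOnly.content.isEmpty)
      assert(metaOnly.length == 3L)

      // `content` is required, so the bytes are loaded.
      val full = readRow(tmp, Set("length", "content"))
      assert(full.content.exists(_.sameElements(Array[Byte](1, 2, 3))))
    } finally {
      Files.deleteIfExists(tmp)
    }
  }
}
```

In Spark itself the set of required columns would come from the query's pruned schema rather than from an explicit argument; how that schema reaches the binary file reader is left open here.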