Hi Gil
You also need to prune each resulting Row so that it contains only the
requested columns, in the order given by requiredColumns.
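A minimal sketch of that pruning step, written without any Spark dependency so it can be run on its own. The object PruneSketch, the helper pruneRow and the sample column names are hypothetical and are not part of spark-csv; the sketch assumes each parsed line is available as an array of values in file order.

```scala
// Hypothetical helper, not part of spark-csv: keeps only the requested
// columns of one parsed line, in the order given by requiredColumns.
object PruneSketch {
  def pruneRow(
      schemaFields: Array[String],    // all column names, in file order
      fullRow: Array[String],         // all values of one parsed CSV line
      requiredColumns: Array[String]  // the argument passed to buildScan
  ): Array[String] = {
    // Map each column name to its position in the full row.
    val indexOf = schemaFields.zipWithIndex.toMap
    // Emit values in the order of requiredColumns, not in file order.
    requiredColumns.map(name => fullRow(indexOf(name)))
  }

  def main(args: Array[String]): Unit = {
    val fields = Array("year", "make", "model")
    val line = Array("2012", "Tesla", "S")
    // Prints "S,2012": two columns only, in the requested order.
    println(pruneRow(fields, line, Array("model", "year")).mkString(","))
  }
}
```

Inside buildScan(requiredColumns), each pruned array would then presumably be wrapped with Row.fromSeq(...) in place of the full-width Row that the TableScan version builds.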
Ram
On Jul 7, 2015, at 3:12 AM, Gil Vernik g...@il.ibm.com wrote:
Hi All,
I wanted to experiment a little bit with TableScan and PrunedScan.
My first test was to print columns from various SQL queries.
To make this test easier, I just took spark-csv and replaced TableScan with
PrunedScan.
I then changed the buildScan method of CsvRelation from
def buildScan = {
to
def buildScan(requiredColumns: Array[String]) = {…
This was the only modification I made to CsvRelation.scala. I also added a
log statement that prints requiredColumns.
I then took the same CSV file and ran a very simple SELECT query on it.
I noticed that when CsvRelation used TableScan, everything worked correctly.
But when I used PrunedScan, it didn’t work: it returned empty columns or
columns in the wrong order.
Why does this happen? Is it a bug? I thought that PrunedScan is supposed to
work exactly the same as TableScan, so that I can freely change TableScan to
PrunedScan. I thought the only difference is that buildScan of PrunedScan
takes requiredColumns as a parameter.
Can someone explain the behavior I saw?
I am using Spark 1.5 from trunk.
Thanks a lot
Gil.