jinchengchenghh commented on PR #5447:
URL: https://github.com/apache/incubator-gluten/pull/5447#issuecomment-2105615316

   TPC-H SF2000 Q6 performance, query:
   `select sum(l_extendedprice * l_discount) as revenue from lineitem where l_shipdate >= '1994-01-01' and l_shipdate < '1995-01-01' and l_discount between .06 - 0.01 and .06 + 0.01 and l_quantity < 24`
   
   lineitem data: 622G
   
   csv gluten without native reader | csv gluten native csv reader
   -- | --
   8333.039907 | 2456
   
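   To put the two figures side by side, here is a small sketch of the ratio between them. It assumes both values are elapsed times in the same unit (the comment does not state the unit), so a lower number is better; the variable names are only illustrative.

   ```scala
   // Figures copied from the table above; the unit is not stated in the comment.
   val withoutNativeReader = 8333.039907
   val withNativeReader = 2456.0

   // Assumption: both values are elapsed times in the same unit,
   // so their ratio can be read as a speedup factor.
   val speedup = withoutNativeReader / withNativeReader
   println(f"speedup: $speedup%.2fx")
   ```

   Under that assumption, the native CSV reader comes out roughly 3.4 times faster on this query.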
   Test script:
   ```
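   import org.apache.spark.sql.types._

   // Explicit schema so the CSV columns line up with the Arrow schema.
   val schema = (new StructType()
     .add("l_orderkey", LongType).add("l_partkey", LongType)
     .add("l_suppkey", LongType).add("l_linenumber", LongType)
     .add("l_quantity", DoubleType).add("l_extendedprice", DoubleType)
     .add("l_discount", DoubleType).add("l_tax", DoubleType)
     .add("l_returnflag", StringType).add("l_linestatus", StringType)
     .add("l_shipdate", DateType).add("l_commitdate", DateType)
     .add("l_receiptdate", DateType).add("l_shipinstruct", StringType)
     .add("l_shipmode", StringType).add("l_comment", StringType))

   val lineitem = spark.read.format("csv").option("header", "true").schema(schema).load(
     "file:///mnt/DP_disk2/tpch/csvdata/")
   // Register the DataFrame so the SQL text in q6 can resolve `lineitem`.
   lineitem.createOrReplaceTempView("lineitem")
   // q6 is assumed to hold the Q6 query string quoted earlier in this comment.
   // show() forces execution; spark.sql alone only builds a lazy plan.
   spark.sql(q6).show()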
   ```
   Note: the file schema should match the Arrow schema, so specify it 
explicitly with `.schema(arrow_matched_schema)`.
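   The test script refers to `q6` without defining it. A minimal sketch of one way to bind it, assuming `q6` is simply the Q6 text quoted at the top of this comment held in a Scala string:

   ```scala
   // Assumption: q6 is the Q6 query from this comment, stored as a plain string.
   // stripMargin removes the leading '|' markers used to keep the SQL readable.
   val q6: String =
     """select sum(l_extendedprice * l_discount) as revenue
       |from lineitem
       |where l_shipdate >= '1994-01-01'
       |  and l_shipdate < '1995-01-01'
       |  and l_discount between .06 - 0.01 and .06 + 0.01
       |  and l_quantity < 24""".stripMargin
   ```

   In a Spark session where `lineitem` is visible as a table or temp view, something like `spark.time(spark.sql(q6).show())` would run the query and print the elapsed time. Whether the figures above were collected this way is not stated.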
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

