chenpingzeng opened a new issue, #1512:
URL: https://github.com/apache/orc/issues/1512

   **Background info:**
   In a Spark project, we use ORC C++ as an acceleration library to read HDFS 
files, as an alternative to the original Spark table scan written in Java/Scala.
   Some of the TPC-DS 99 queries occasionally fail with inconsistent data. So 
far SQL-9, SQL-28, and SQL-47 have hit the problem on different clusters.
   
   **Appearance:**
   For example, with 3 TB of data generated by the TPC-DS tool, executing 
'select ss_sold_time_sk, ss_item_sk, ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, 
ss_addr_sk, ss_store_sk, ss_promo_sk, ss_ticket_number, ss_quantity, 
ss_wholesale_cost, ss_list_price, ss_sales_price, ss_ext_discount_amt, 
ss_ext_sales_price, ss_ext_wholesale_cost, ss_ext_list_price, ss_ext_tax, 
ss_coupon_amt, ss_net_paid, ss_net_paid_inc_tax, ss_net_profit, ss_sold_date_sk 
from store_sales where ss_item_sk=302314 and ss_customer_sk=11587351 and 
ss_quantity=24;' returns the result below:
   
![image](https://github.com/apache/orc/assets/58206775/5c8bc640-df2f-47b2-af7d-dcdad44a9383)
   The problem is that the actual values of ss_list_price/ss_ext_sales_price 
are 15.81/7.44, but **we get NULL**.
   
   **Analysis info:**
   
![image](https://github.com/apache/orc/assets/58206775/b6ec94ec-7be2-415b-95a9-809acc28bb4c)
   


