taiyang-li commented on PR #1375:
URL: https://github.com/apache/orc/pull/1375#issuecomment-1754699373

   @wpleonardo Do we have any performance benchmarks for this PR?
   @alexey-milovidov You may be interested in this.
   
   I tried this feature in ClickHouse (https://github.com/clickHouse/ClickHouse), but I can't see any performance improvement.
   
   Q: `select * from file('/data1/clickhouse_official/data/user_files/bigolive_audience_stats_orc.orc') format Null;`
   
   With AVX512: 
   ```
   0 rows in set. Elapsed: 3.659 sec. Processed 1.13 million rows, 486.19 MB (308.68 thousand rows/s., 132.88 MB/s.)
   0 rows in set. Elapsed: 3.653 sec. Processed 1.20 million rows, 517.87 MB (329.40 thousand rows/s., 141.76 MB/s.)
   0 rows in set. Elapsed: 3.719 sec. Processed 1.13 million rows, 486.19 MB (303.70 thousand rows/s., 130.74 MB/s.)
   ```
   
   Without AVX512:
   ```
   0 rows in set. Elapsed: 3.565 sec. Processed 1.13 million rows, 486.19 MB (316.81 thousand rows/s., 136.38 MB/s.)
   0 rows in set. Elapsed: 3.540 sec. Processed 1.20 million rows, 517.87 MB (339.91 thousand rows/s., 146.28 MB/s.)
   0 rows in set. Elapsed: 3.681 sec. Processed 1.20 million rows, 517.87 MB (326.90 thousand rows/s., 140.69 MB/s.)
   ``` 
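
To make the comparison concrete, here is a minimal sketch that averages the elapsed times from the two sets of runs above. The variable names are illustrative only. Three runs per configuration is a small sample, so the result is a rough indication rather than a rigorous benchmark.

```python
from statistics import mean

# Elapsed times in seconds, copied from the query output above.
with_avx512 = [3.659, 3.653, 3.719]
without_avx512 = [3.565, 3.540, 3.681]

mean_with = mean(with_avx512)
mean_without = mean(without_avx512)

# Positive value means the AVX512 build took longer on average.
relative_diff = (mean_with - mean_without) / mean_without

# Run-to-run spread within each configuration, for context.
spread_with = max(with_avx512) - min(with_avx512)
spread_without = max(without_avx512) - min(without_avx512)

print(f"mean with AVX512:    {mean_with:.3f} s (spread {spread_with:.3f} s)")
print(f"mean without AVX512: {mean_without:.3f} s (spread {spread_without:.3f} s)")
print(f"relative difference: {relative_diff:+.1%}")
```

The means come out at roughly 3.68 s with AVX512 and 3.60 s without, a gap of about 2% that is smaller than the spread of the runs without AVX512, so these numbers show no measurable gain.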
   
   About the test orc file: 
   ```
   $ du -sh bigolive_audience_stats_orc.orc
   505M bigolive_audience_stats_orc.orc
   
   
   $ orc-metadata ./bigolive_audience_stats_orc.orc
   { "name": "./bigolive_audience_stats_orc.orc",
     "type": "struct<reporttime:bigint,appid:bigint,uid:bigint,platform:int,nettype:int,clientversioncode:bigint,sdkversioncode:bigint,statid:string,statversion:int,countrycode:string,language:string,model:string,osversion:string,channel:string,heartcount:int,msgcount:int,giftcount:int,barragecount:int,gid:string,entrytype:int,prefetchedms:int,linkdstate:int,networkavailable:int,starttimestamp:bigint,sessionlogints:int,medialogints:int,sdkboundts:int,msconnectedts:int,vsconnectedts:int,firstiframets:int,ownerstatus:int,stopreason:int,totaltime:int,cpuusageavg:int,memusageavg:int,backgroundtotal:bigint,foregroundtotal:bigint,firstvideopackts:int,firstvoicerecvts:int,firstvoiceplayts:int,firstiframeassemblets:int,uiinitts:int,uiloadedts:int,uiappearedts:int,setvideoviewts:int,blurviewdimissts:int,preparesdkinqueuets:int,preparesdkexects:int,startsdkinqueuets:int,startsdkexects:int,sdkjoinchannelinqueuets:int,sdkjoinchannelexects:int,lastsdkleavechannelinqueuets:int,lastsdkleavechannelexects:int,unused_1:int,unused_2:int,setvideoviewinqueuets:int,setvideoviewexects:int,livetype:int,audiostatus:int,firstiframesize:bigint,firstiframedecodetime:bigint,extras:bigint,entrancetype:int,entrancemode:int,mclientip:bigint,mnc:bigint,mcc:bigint,vsipsuccess:bigint,msipsuccess:bigint,vsipfail:bigint,msipfail:bigint,mediaflag:bigint,dispatchid:string,proxyflag:int,redirectcount:int,directorrescode:int,subentrancetab:string,logininfolist:array<struct<strategy:bigint,ip:bigint,loginStat:bigint,reserve1:bigint,reserve2:bigint>>,playcentertype:int,videomutetype:bigint,owneruid:bigint,extra:string>",
     "rows": 1203317,
     "stripe count": 12,
     "format": "0.12", "writer version": "future - 9",
     "compression": "snappy", "compression block": 65536,
     "file length": 529207118,
     "content": 529182229, "stripe stats": 21150, "footer": 3712, "postscript": 26,
     "row index stride": 10000,
     "user metadata": {
       "org.apache.spark.version": "3.3.2"
     },
     "stripes": [
       { "stripe": 0, "rows": 117760,
         "offset": 3, "length": 50876922,
         "index": 23728, "data": 50851823, "footer": 1371
       },
       { "stripe": 1, "rows": 117760,
         "offset": 50876925, "length": 50948680,
         "index": 23679, "data": 50923619, "footer": 1382
       },
       { "stripe": 2, "rows": 62050,
         "offset": 101825605, "length": 26902880,
         "index": 15322, "data": 26886211, "footer": 1347
       },
       { "stripe": 3, "rows": 117760,
         "offset": 128728485, "length": 50474083,
         "index": 24110, "data": 50448601, "footer": 1372
       },
       { "stripe": 4, "rows": 117760,
         "offset": 179202568, "length": 50413042,
         "index": 23858, "data": 50387825, "footer": 1359
       },
       { "stripe": 5, "rows": 63570,
         "offset": 229615610, "length": 27504277,
         "index": 14890, "data": 27488029, "footer": 1358
       },
       { "stripe": 6, "rows": 117760,
         "offset": 268435456, "length": 50981984,
         "index": 24191, "data": 50956424, "footer": 1369
       },
       { "stripe": 7, "rows": 117760,
         "offset": 319417440, "length": 51017894,
         "index": 23792, "data": 50992731, "footer": 1371
       },
       { "stripe": 8, "rows": 61720,
         "offset": 370435334, "length": 26840720,
         "index": 15246, "data": 26824109, "footer": 1365
       },
       { "stripe": 9, "rows": 117760,
         "offset": 397276054, "length": 49971095,
         "index": 23487, "data": 49946233, "footer": 1375
       },
       { "stripe": 10, "rows": 117760,
         "offset": 447247149, "length": 50259825,
         "index": 24090, "data": 50234369, "footer": 1366
       },
       { "stripe": 11, "rows": 73897,
         "offset": 497506974, "length": 31675255,
         "index": 16948, "data": 31656952, "footer": 1355
       }
     ]
   }
   
   ```
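
As a quick sanity check on the metadata above, the following sketch sums the per-stripe row counts and lengths. The values are copied from the `orc-metadata` output; the variable names are illustrative only.

```python
# (rows, length in bytes) for stripes 0..11, copied from the orc-metadata output.
stripes = [
    (117760, 50876922),
    (117760, 50948680),
    (62050, 26902880),
    (117760, 50474083),
    (117760, 50413042),
    (63570, 27504277),
    (117760, 50981984),
    (117760, 51017894),
    (61720, 26840720),
    (117760, 49971095),
    (117760, 50259825),
    (73897, 31675255),
]

total_rows = sum(rows for rows, _ in stripes)
total_length = sum(length for _, length in stripes)

print(f"stripes:       {len(stripes)}")
print(f"total rows:    {total_rows}")
print(f"total length:  {total_length} bytes")
print(f"bytes per row: {total_length / total_rows:.1f}")
```

The row total equals the reported `"rows": 1203317`, and the summed stripe length of 517,866,657 bytes appears to correspond to the "517.87 MB" that the query reports as processed.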


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
