Hey guys,

Is there any performance benchmark for the PXF interface? I would like to study 
the overhead of performing a large table scan when communicating through the 
PXF REST interface.


It seems there's no Parquet HDFS plugin, so there's no direct way to do a 
head-to-head comparison with and without the PXF framework.
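
To make the question concrete, here is a rough sketch of the kind of measurement I have in mind: time the same scan once natively and once through PXF, then report the relative overhead. The function names and the idea of passing each scan in as a callable are my own placeholders, not part of PXF; how each scan would actually be run is left open.

```python
import statistics
import time


def median_seconds(run_scan, repeats=5):
    """Run a scan callable several times and return the median wall-clock time.

    run_scan is a hypothetical placeholder for "execute one full table scan";
    it could wrap a query against a native table or a PXF external table.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_scan()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def relative_overhead(baseline_seconds, pxf_seconds):
    """Fraction of extra time the PXF path takes compared with the baseline."""
    return (pxf_seconds - baseline_seconds) / baseline_seconds
```

Using the median rather than a single run is only meant to dampen caching and warm-up noise; it doesn't replace a proper benchmark setup.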


Are there any internal benchmark results you could share?


Also, since I haven't seen any detailed documentation on how exactly PXF works, 
can you correct me if I'm wrong?
In my understanding, backend/access/external is the main component that handles 
PXF calls, so any external table access invokes this module to send a request 
to the local PXF-SERVICE (on the node where the master is located). PXF-SERVICE 
is responsible for picking up the correct Java libraries and constructing 
filters. It first attempts to get the fragments and then assigns them to the 
Segment processes (trying to match hostnames for data locality). Each Segment 
process then talks to its local PXF-SERVER, which calls the Accessor class to 
fetch data from external storage and passes the result back to the Segment 
process through the REST API.
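
To illustrate the fragment assignment step as I picture it, here is a small sketch. It reflects only my assumption of how locality matching might work, not the actual PXF or backend code; the function name, the data shapes, and the least-loaded tie-breaking are all made up for illustration.

```python
from collections import defaultdict


def assign_fragments(fragments, segments):
    """Assign each fragment to a segment, preferring a segment on a replica host.

    fragments: list of (fragment_id, [hosts holding a replica])  (assumed shape)
    segments:  list of (segment_id, host)                        (assumed shape)
    Returns a dict mapping segment_id -> list of fragment_ids.
    """
    load = {seg_id: 0 for seg_id, _ in segments}

    # Index segments by host so local candidates can be looked up quickly.
    by_host = defaultdict(list)
    for seg_id, host in segments:
        by_host[host].append(seg_id)

    assignment = defaultdict(list)
    for frag_id, hosts in fragments:
        # Segments running on a host that stores this fragment are "local".
        local = [s for h in hosts for s in by_host.get(h, [])]
        # Fall back to every segment when no local one exists.
        candidates = local or list(load)
        # Among the candidates, pick the one with the fewest fragments so far.
        chosen = min(candidates, key=lambda s: load[s])
        assignment[chosen].append(frag_id)
        load[chosen] += 1
    return dict(assignment)


if __name__ == "__main__":
    frags = [("f0", ["host-a"]), ("f1", ["host-b"]), ("f2", ["host-c"])]
    segs = [(0, "host-a"), (1, "host-b")]
    print(assign_fragments(frags, segs))
```

In this sketch a fragment with no co-located segment is simply read remotely by the least-loaded segment; whether the real implementation behaves that way is part of what I'm asking.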


Is my understanding correct?


Cheers
