Near created BEAM-6874:
--------------------------

             Summary: HCatalogTableProvider always read all rows
                 Key: BEAM-6874
                 URL: https://issues.apache.org/jira/browse/BEAM-6874
             Project: Beam
          Issue Type: Bug
          Components: io-java-hcatalog
    Affects Versions: 2.11.0
            Reporter: Near


Hi,

I'm using HCatalogTableProvider with SqlTransform.query. The query is 
something like "select * from `hive`.`table_name` limit 10". Despite the 
limit clause, the data source still reads many more rows than requested (the 
Hive table's data is stored as files on S3), even more than the number of 
rows in one file (or partition).
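
One possible reading of this behaviour (an assumption, not confirmed from the 
Beam source) is that the limit is applied downstream of the read instead of 
stopping the read early. The plain-Java sketch below is not Beam code; the 
class LimitSketch and its methods are hypothetical and only illustrate the 
difference in how many rows get pulled from a source in each case.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicLong;

public class LimitSketch {

    /** Hypothetical row source that counts every row pulled from it. */
    public static Iterator<Integer> countingSource(final int totalRows, final AtomicLong rowsRead) {
        return new Iterator<Integer>() {
            private int position = 0;

            @Override
            public boolean hasNext() {
                return position < totalRows;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                rowsRead.incrementAndGet();
                return position++;
            }
        };
    }

    /** Limit applied after the scan: every row is read, then the result is truncated. */
    public static List<Integer> limitAfterFullScan(Iterator<Integer> source, int limit) {
        List<Integer> all = new ArrayList<>();
        while (source.hasNext()) {
            all.add(source.next());
        }
        return new ArrayList<>(all.subList(0, Math.min(limit, all.size())));
    }

    /** Limit that stops pulling rows from the source once it is satisfied. */
    public static List<Integer> limitWithEarlyStop(Iterator<Integer> source, int limit) {
        List<Integer> out = new ArrayList<>();
        while (out.size() < limit && source.hasNext()) {
            out.add(source.next());
        }
        return out;
    }

    public static void main(String[] args) {
        AtomicLong fullScanReads = new AtomicLong();
        limitAfterFullScan(countingSource(100_000, fullScanReads), 10);

        AtomicLong earlyStopReads = new AtomicLong();
        limitWithEarlyStop(countingSource(100_000, earlyStopReads), 10);

        System.out.println("rows read, limit after full scan: " + fullScanReads.get());
        System.out.println("rows read, limit with early stop: " + earlyStopReads.get());
    }
}
```

Both methods return the same 10 rows; only the number of rows pulled from the 
source differs, which is the symptom described above.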

Some more details:
 # The pipeline is running on Flink.
 # I actually implemented my own HiveTableProvider because HCatalogBeamSchema 
only supports primitive types. Even so, the table provider works correctly 
when I query a small table with ~1k rows.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
