If possible, could you please provide some sample data for your error case?

I tried to reproduce the problem by copying the TPC-H sample
nation.parquet into two different directories (nation1, nation2), and the
query joining the two directories seems to work fine.

ls nation1 nation2
nation1:
nation.parquet

nation2:
nation.parquet
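
In case you want to repeat this setup on your side, here is a minimal
sketch of how the two directories can be created. The helper name
setup_join_dirs and both paths are illustrative assumptions, not anything
Drill provides; the second argument has to be whatever directory your
dfs.tmp workspace maps to (often /tmp, but please check your storage
plugin configuration).

```shell
#!/bin/sh
# Hypothetical helper (not part of Drill): copy one sample Parquet file into
# two sibling directories so that each directory can be queried as a table.
setup_join_dirs() {
    sample="$1"      # path to the sample file, e.g. nation.parquet
    workspace="$2"   # directory backing the workspace (assumed, e.g. /tmp)
    mkdir -p "$workspace/nation1" "$workspace/nation2" || return 1
    cp "$sample" "$workspace/nation1/nation.parquet" || return 1
    cp "$sample" "$workspace/nation2/nation.parquet" || return 1
}

# Example call (both paths are assumptions):
#   setup_join_dirs ./nation.parquet /tmp
```

Each directory then holds a single copy of the same file, which matches
the listing above.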


select * from dfs.tmp.`nation1` t1 join dfs.tmp.`nation2` t2 on
t1.n_nationkey = t2.n_nationkey;
+--------------+--------------+--------------+--------------+---------------+--------------+---------------+--------------+
| N_NATIONKEY  |    N_NAME    | N_REGIONKEY  |  N_COMMENT   | N_NATIONKEY0  |   N_NAME0    | N_REGIONKEY0  |  N_COMMENT0  |
+--------------+--------------+--------------+--------------+---------------+--------------+---------------+--------------+
...
25 rows selected (0.31 seconds)


select * from dfs.tmp.`nation1` t1, dfs.tmp.`nation2` t2 where
t1.n_nationkey = t2.n_nationkey;
+--------------+--------------+--------------+--------------+---------------+--------------+---------------+--------------+
| N_NATIONKEY  |    N_NAME    | N_REGIONKEY  |  N_COMMENT   | N_NATIONKEY0  |   N_NAME0    | N_REGIONKEY0  |  N_COMMENT0  |
+--------------+--------------+--------------+--------------+---------------+--------------+---------------+--------------+
...
25 rows selected (0.289 seconds)


My guess is that the error is related to the field structure in your
Parquet files. How did you create the Parquet files used in your query?



On Wed, Jul 22, 2015 at 7:25 AM, Usman Ali <[email protected]>
wrote:

> Hi,
>      I am trying to take a join on two directories which contain parquet
> files. My query reads:
> *select * from hdfs.root.`parquet1` as t1 join hdfs.root.`parquet2` as t2
> on t1.field1= t2.field1;*
> (parquet1 and parquet2 directories contain parquet files in them)
>
>  It gives an error saying  Field References Must be Singular Names.
>
> However, when I select only some of the fields it works fine:
> *select  t1.field2, t2.filed2 from hdfs.root.`parquet1` as t1 join
> hdfs.root.`parquet2` as t2 on t1.field1= t2.field1;  (This works fine)*
>
> Surprisingly, when I run the following query (on parquet files rather than
> directories) it again works fine:
> *select * from hdfs.root.`file1.parquet` as t1 join
> hdfs.root.`file2.parquet` as t2 on t1.field1= t2.field1;*
>
>  Someone, help me please.
>
> Regards,
> Usman Ali
>
