Could you please give the complete test case so that I can reproduce the issue?
On Mon, Feb 24, 2014 at 7:48 PM, David Wang <[email protected]> wrote:

> Hi Maryann,
>
> I tried moving the lineitem table (the largest table) to the first place
> in the query below:
>
>     select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
>     from lineitem
>     inner join orders on l_orderkey = o_orderkey
>     inner join supplier on l_suppkey = s_suppkey
>     inner join customer on c_nationkey = s_nationkey and c_custkey = o_custkey
>     inner join nation on s_nationkey = n_nationkey
>     inner join region on n_regionkey = r_regionkey
>     where r_name = 'AMERICA'
>       and o_orderdate >= '1993-01-01' and o_orderdate < '1994-01-01'
>     group by n_name
>     order by revenue desc
>
> But when I execute it I get the following error:
>
> java.lang.RuntimeException: com.salesforce.phoenix.exception.PhoenixIOException:
> com.salesforce.phoenix.exception.PhoenixIOException: Failed after attempts=14, exceptions:
> Mon Feb 24 19:36:50 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: �@2[]
> Mon Feb 24 19:36:51 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: �@2[]
> Mon Feb 24 19:36:52 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: �@2[]
> Mon Feb 24 19:36:54 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: �@2[]
> Mon Feb 24 19:36:56 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: �@2[]
> Mon Feb 24 19:37:00 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: �@2[]
> Mon Feb 24 19:37:04 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: �@2[]
> Mon Feb 24 19:37:12 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: �@2[]
> Mon Feb 24 19:37:28 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: �@2[]
> Mon Feb 24 19:38:00 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: i?�0��
> Mon Feb 24 19:39:05 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: i?�0��
> Mon Feb 24 19:40:09 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: i?�0��
> Mon Feb 24 19:41:13 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: i?�0��
> Mon Feb 24 19:42:18 EST 2014, org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6, java.io.IOException: java.io.IOException: Could not find hash cache for joinId: i?�0��
>
>     at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2440)
>     at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2074)
>     at sqlline.SqlLine.print(SqlLine.java:1735)
>     at sqlline.SqlLine$Commands.execute(SqlLine.java:3683)
>     at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
>     at sqlline.SqlLine.dispatch(SqlLine.java:821)
>     at sqlline.SqlLine.begin(SqlLine.java:699)
>     at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
>     at sqlline.SqlLine.main(SqlLine.java:424)
>
> Do you know of any way I can fix this?
>
> Thank you so much,
>
> David
>
>
> On Mon, Feb 24, 2014 at 7:02 PM, Maryann Xue <[email protected]> wrote:
>
>> Hi David,
>>
>> Since join is implemented through hash join and table stats are not yet
>> available to help better decide the join order (and which tables should be
>> small enough to go into memory), users may sometimes have to be careful
>> about the table order in the query.
>> In your case, you can try moving the lineitem table (which is the largest
>> here) to the first place in the query. I think that should work.
>>
>> Thanks,
>> Maryann
>>
>>
>> On Mon, Feb 24, 2014 at 6:08 PM, David Wang <[email protected]> wrote:
>>
>>> Hi Maryann,
>>>
>>> The sizes of each table in my query are as follows:
>>> customer - 24 MB
>>> orders - 164 MB
>>> lineitem - 725 MB
>>> supplier - 1.4 MB
>>> nation - 2.2 KB
>>> region - 400 B
>>>
>>> The heap size of my region servers is 4 GB.
>>>
>>> Thank you,
>>> David
>>>
>>>
>>> On Mon, Feb 24, 2014 at 5:58 PM, Maryann Xue <[email protected]> wrote:
>>>
>>>> Hi David,
>>>>
>>>> What are the sizes of each table in your query? And what is the heap
>>>> size of your region server?
>>>>
>>>> Thanks,
>>>> Maryann
>>>>
>>>>
>>>> On Mon, Feb 24, 2014 at 5:50 PM, David Wang <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I had a question about running queries containing inner joins on
>>>>> Phoenix, but wasn't sure where to post.
>>>>> I downloaded phoenix-master (3.0.0) and inserted 10 MB of data through
>>>>> psql.sh. I found I was able to run a query which contains an inner join
>>>>> in sqlline.
>>>>>
>>>>> The query I tried:
>>>>>
>>>>>     select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
>>>>>     from customer
>>>>>     inner join orders on c_custkey = o_custkey
>>>>>     inner join lineitem on l_orderkey = o_orderkey
>>>>>     inner join supplier on l_suppkey = s_suppkey and c_nationkey = s_nationkey
>>>>>     inner join nation on s_nationkey = n_nationkey
>>>>>     inner join region on n_regionkey = r_regionkey
>>>>>     where r_name = 'AMERICA'
>>>>>       and o_orderdate >= '1993-01-01' and o_orderdate < '1994-01-01'
>>>>>     group by n_name
>>>>>     order by revenue desc;
>>>>>
>>>>> But when I increased the data size from 10 MB to 1 GB and tried to run
>>>>> the same query, I got the following error:
>>>>> Error: Encountered exception in hash plan execution. (state=,code=0)
>>>>>
>>>>> I have included the path to snappy-java-1.0.3.2.jar in sqlline.sh like
>>>>> the following:
>>>>>
>>>>>     java -cp ".:$phoenix_client_jar:/usr/local/hbase-0.94.8/lib/snappy-java-1.0.3.2.jar"
>>>>>     -Dlog4j.configuration=file:$current_dir/log4j.properties sqlline.SqlLine
>>>>>     -d com.salesforce.phoenix.jdbc.PhoenixDriver -u jdbc:phoenix:$1 -n none -p none
>>>>>     --color=true --fastConnect=false --verbose=true
>>>>>     --isolation=TRANSACTION_READ_COMMITTED $sqlfile
>>>>>
>>>>> Do you have any idea where the problem might be?
>>>>>
>>>>> Thank you so much,
>>>>>
>>>>> David
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Maryann
>>
>>
>> --
>> Thanks,
>> Maryann

--
Thanks,
Maryann
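
For anyone putting together the complete test case Maryann asks for above, the sketch below shows the shape such a repro could take: DDL for the six tables, data loaded through psql.sh, and then the failing query run in sqlline. The column types and primary keys are assumptions based on the standard TPC-H schema the table and column names suggest (trimmed to the columns the query references), not the DDL David actually used, and whether the o_orderdate filter needs TO_DATE() depends on how that column was really declared and loaded.

    -- Hypothetical DDL, assuming TPC-H-style columns (only those used in the query).
    CREATE TABLE IF NOT EXISTS region   (r_regionkey INTEGER NOT NULL PRIMARY KEY, r_name VARCHAR);
    CREATE TABLE IF NOT EXISTS nation   (n_nationkey INTEGER NOT NULL PRIMARY KEY, n_name VARCHAR, n_regionkey INTEGER);
    CREATE TABLE IF NOT EXISTS supplier (s_suppkey INTEGER NOT NULL PRIMARY KEY, s_nationkey INTEGER);
    CREATE TABLE IF NOT EXISTS customer (c_custkey INTEGER NOT NULL PRIMARY KEY, c_nationkey INTEGER);
    CREATE TABLE IF NOT EXISTS orders   (o_orderkey INTEGER NOT NULL PRIMARY KEY, o_custkey INTEGER, o_orderdate DATE);
    CREATE TABLE IF NOT EXISTS lineitem (l_orderkey INTEGER NOT NULL, l_linenumber INTEGER NOT NULL,
        l_suppkey INTEGER, l_extendedprice DECIMAL, l_discount DECIMAL
        CONSTRAINT pk PRIMARY KEY (l_orderkey, l_linenumber));

    -- Load each table with psql.sh (CSV files not shown), then run the query from
    -- the thread; at the 1 GB scale this is what triggers the repeated
    -- "Could not find hash cache for joinId" retries on the region servers.
    SELECT n_name, SUM(l_extendedprice * (1 - l_discount)) AS revenue
    FROM lineitem
    INNER JOIN orders   ON l_orderkey = o_orderkey
    INNER JOIN supplier ON l_suppkey = s_suppkey
    INNER JOIN customer ON c_nationkey = s_nationkey AND c_custkey = o_custkey
    INNER JOIN nation   ON s_nationkey = n_nationkey
    INNER JOIN region   ON n_regionkey = r_regionkey
    WHERE r_name = 'AMERICA'
      AND o_orderdate >= '1993-01-01' AND o_orderdate < '1994-01-01'
    GROUP BY n_name
    ORDER BY revenue DESC;

The data files and row counts are the part only David can supply, since those are what separate the working 10 MB run from the failing 1 GB run.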
