On Thu, Feb 25, 2010 at 1:19 PM, Ning Zhang <[email protected]> wrote:
> Edward,
> Multi-table joins are practical and we do a lot of these here. If the OOM
> exception was thrown from a regular reduce-side join, it may be caused by
> skewness in your join keys.
> From branch-0.5 onward, you will have a parameter hive.join.cache.size
> to control how many rows you want to put into the cache when doing joins.
> Before branch-0.5, what we do for a regular reduce-side join is, for each
> join key, read all rows from the small tables (all tables except the
> rightmost one) into main memory and join them with the rightmost table
> (a.k.a. the streaming table). Note this is done for each distinct join
> key. If you have a lot of rows in the small tables but each join key has
> only a small number of rows, the memory consumption is not that large.
> But if there is skew in your join keys, you may see OOM exceptions,
> particularly when you are joining multiple tables in the same join
> operator.
> For a map-side join pre branch-0.5, we read all rows from the small table
> specified by the mapjoin hint and build a hashmap on the join key. From
> branch-0.5, you can also set a parameter hive.mapjoin.max.numrows to
> control how many rows are read into the cache for each map-joined
> table.
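> As a rough sketch, the two parameters mentioned above would be set per
> session like this (the values here are only illustrative, not
> recommendations):

```sql
-- Illustrative values only; tune to your data and heap size.
-- Cap on rows cached per join key in a reduce-side join (branch-0.5+).
set hive.join.cache.size=25000;
-- Cap on rows read into memory for each map-joined table (branch-0.5+).
set hive.mapjoin.max.numrows=500000;
```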
> Thanks,
> Ning
>
> @Ning
>
> > Map-join works with a persistent data structure starting from branch
> > 0.5. We will keep a cache using a HashMap in main memory, and you can
> > specify the # of rows to be put in the cache. If it is too large, data
> > will be spilled to a disk-based hash table.
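> For reference, a map-join is requested with a query hint naming the
> small table(s) to build the in-memory hash table on; the table and
> column names below are just for illustration:

```sql
-- MAPJOIN hint: f (the small table) is loaded into a hash table on the
-- join key and the large table is streamed through the mappers.
SELECT /*+ MAPJOIN(f) */ p.*, f.name
FROM pageviews p
JOIN files f ON (p.file_id = f.id);
```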
>
> So I have a problem: I am trying to join 5 large tables. My pageviews
> table is partitioned by day and is ~6GB, but I have 4 other tables that
> can be about 1GB each.
>
> Currently joining the 5 tables throws OOM.
>
> I see only two ways I can solve this.
> 1) break my joins up into stages, each stage is written to a hive table
>
> @Ning. or anyone else
>
> With the enhancements to mapjoin in 0.5, will my 5-table join work
> (theoretically) if I use mapjoin? Speed is not really an issue here,
> but if joining this many tables is not practical in most cases, I need
> to start planning for this now.
>
> Thanks
>
>
Ning,
Thank you for your reply.
> Note this is done for each distinct join key. If you have a
> lot of rows in the small tables but each join key has only a small
> number of rows, the memory consumption is not that large
I believe my problem is the opposite. We have a star schema: the main
table is pageviews, and the other tables are pages, client_ips, etc.
Thus each row in the 'files' table matches many rows in the large table.
Also, the joins are not all on the same key; each join is on a different
column.
....
from client_ips c
join pageviews p on p.clientip_id = c.id
join files f on p.file_id = f.id
....
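A staged version of this join (option 1 above) might look like the
sketch below; the intermediate table names are made up, and if CREATE
TABLE AS is not available in your Hive version, CREATE TABLE followed by
INSERT OVERWRITE works the same way:

```sql
-- Stage 1: join pageviews to client_ips and persist the result,
-- so the next join starts from a materialized table.
CREATE TABLE tmp_pv_ips AS
SELECT p.*, c.ip
FROM client_ips c
JOIN pageviews p ON (p.clientip_id = c.id);

-- Stage 2: join the intermediate table to files; further dimension
-- tables would be folded in one stage at a time.
CREATE TABLE tmp_pv_files AS
SELECT t.*, f.name
FROM tmp_pv_ips t
JOIN files f ON (t.file_id = f.id);
```

Each stage then needs memory for only one join at a time, at the cost of
writing the intermediate tables to HDFS.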
I upgraded to 0.5.0-rc0 and the query ran out of the box! I am not sure
whether hive.join.cache.size helped, but we noticed that Hive may have
chosen more mappers/reducers than 0.4 did.
We had many of these:
java.io.IOException: Filesystem closed
2010-02-25 13:42:15,168 INFO org.apache.hadoop.mapred.MapTask: Finished spill 22
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 6 finished. closing...
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 6 forwarded 6271391 rows
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 2 finished. closing...
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 2 forwarded 0 rows
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: 3 finished. closing...
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: 3 forwarded 0 rows
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 2 Close done
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished. closing...
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 6271391 rows
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: 1 finished. closing...
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: 1 forwarded 0 rows
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 Close done
2010-02-25 13:42:16,474 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 6 Close done
2010-02-25 13:42:16,474 INFO ExecMapper: ExecMapper: processed 6271391 rows: used memory = 110014976
2010-02-25 13:42:16,479 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.IOException: Filesystem closed
        at org.apache.hadoop.dfs.DFSClient.checkOpen(DFSClient.java:195)
        at org.apache.hadoop.dfs.DFSClient.access$600(DFSClient.java:59)
        at org.apache.hadoop.dfs.DFSClient$DFSInputStream.close(DFSClient.java:1359)
        at java.io.FilterInputStream.close(FilterInputStream.java:155)
        at org.apache.hadoop.io.SequenceFile$Reader.close(SequenceFile.java:1587)
        at org.apache.hadoop.mapred.SequenceFileRecordReader.close(SequenceFileRecordReader.java:126)
        at org.apache.hadoop.hive.ql.io.HiveRecordReader.close(HiveRecordReader.java:36)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.close(MapTask.java:173)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:231)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2198)
The tasks did retry and complete, but the retries had a big effect on
the overall job time. I am looking into this, and wondering whether
these are just flaky 0.18.3-isms that a Hadoop upgrade would magically
cure.