This should be fixed as part of https://issues.apache.org/jira/browse/HIVE-405
Can you try in the trunk version and let us know if you still see any problems?

From: Jason Michael [mailto:[email protected]]
Sent: Tuesday, July 14, 2009 3:01 PM
To: hive mailing list
Subject: NPEs when using map-side join

From a previous thread I learned about map side joins. This sounds like exactly what we need since we are typically joining several small tables to a larger fact table. However, I'm getting some NPEs when trying to use them in a query. The tables look something like:

hive> describe dim_1;
OK
key             int
dim_2_key       int
startdate       string
enddate         string
lastmodified    string

hive> describe dim_2;
OK
key             int
dim_3_key       int
name            string
<...>
lastmodified    string

hive> describe dim_3;
OK
key             int
name            string
description     string
lastmodified    string

So there is a one-to-many relationship from dim_3 to dim_2, and from dim_2 to dim_1. Each table contains a few thousand rows. The fact table contains many millions of rows and looks something like:

hive> describe fact;
OK
fact_key        string
measure1        int
measure2        int
measure3        double
dim_1_key       int
day             string
hour            int

Not every record in the fact table has a dim_1_key. That is, it is sometimes null.
The query I'm trying to run looks something like:

select /*+ MAPJOIN(dim_1, dim_2, dim_3) */
  dim_3.name, sum(measure1), sum(measure2)
from fact
join dim_1 on fact.dim_1_key = dim_1.key
join dim_2 on dim_2.key = dim_1.dim_2_key
join dim_3 on dim_3.key = dim_2.dim_3_key
where fact.day='20090601' and fact.hour = 1
group by dim_3.name;

And finally the error I'm getting in some of the mappers is:

java.lang.RuntimeException: Error while closing operators
        at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:208)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2210)
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.MapJoinOperator.close(MapJoinOperator.java:333)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:383)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:383)
        at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:188)
        ... 3 more

Any help greatly appreciated!

Jason
