----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18925/#review36586 -----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java <https://reviews.apache.org/r/18925/#comment67592> I guess I wasn't clear. It's not inapproapriate, but goes beyond its responsibility. Equality implementation is within a context which is the class. The instance to be checked doesn't necessarily has the runtime class info. In fact, the context shouldn't be aware the runtime class of these instances, as child classes can be added any time. Doing getClass == other.getClass() goes beyond the current context. What's more appropriate to check type compatibility by calling if (other instanceof this.class). This is different from checking this.getClass() == other.getClass(). Take Java ArrayList.equals() method as an example. (http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/AbstractList.java#AbstractList.equals%28java.lang.Object%29). This method doesn't do runtime class check. The implementation is saying, other.getClass() doesn't have to be ArrayList, but has to be an instance of ArrayList. It could be an instance of MyArrayList as long as MyArrayList is inherited from ArrayList. If we think it's more protical to do such a check, we'd expect that ArrayList.equals() would also check this.getClass() == other.getClass(). Btw, I don't understand how it breaks transitivity by removing this check. I understand this check was there before your change. I missed it in my previous review. - Xuefu Zhang On March 8, 2014, 12:01 a.m., Szehon Ho wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/18925/ > ----------------------------------------------------------- > > (Updated March 8, 2014, 12:01 a.m.) > > > Review request for hive, Brock Noland, justin coffey, and Xuefu Zhang. > > > Repository: hive-git > > > Description > ------- > > The issue is, as part of select * query, a DeepParquetHiveMapInspector is > used for one column of an overall parquet-table struct object inspector. > > The problem lies in the ObjectInspectorFactory's cache for struct object > inspector. For performance, there is a cache keyed on an array list, of all > object inspectors of columns. The second time the query is run, it attempts > to lookup cached struct inspector. But when the hashmap looks up the part of > the key consisting of the DeepParquetHiveMapInspector, java calls .equals > against the existing DeepParquetHivemapInspector. This fails, as the .equals > method casted the "other" to a "StandardParquetHiveInspector". > > Regenerating the .equals and .hashcode from eclipse. > > Also adding one more check in .equals before casting, to handle the case if > another class of object inspector gets hashed to the same hashcode in the > cache. Then java would call .equals against the other, which in this case is > not of the same class. > > > Diffs > ----- > > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java > 1d72747 > > Diff: https://reviews.apache.org/r/18925/diff/ > > > Testing > ------- > > Manual testing. > > > Thanks, > > Szehon Ho > >