> On Feb. 9, 2015, 2:51 a.m., Rui Li wrote:
> >
>
> Rui Li wrote:
>     Some high level question, do we still need two buffers? And does it make
>     sense to use something like a queue instead of an array as the buffer?

Queue should work too. Using two buffers makes it easier to switch between read and write. Switching itself is cheap here. For RowContainer, it is expensive to switch because of first()/clear(), etc.


> On Feb. 9, 2015, 2:51 a.m., Rui Li wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java, line 54
> > <https://reviews.apache.org/r/30739/diff/4/?file=853475#file853475line54>
> >
> >     If I understand correctly, this can be renamed to something like
> >     IN_MEMORY_NUM_ROWS?

Yes, you are right. Both are ok. Any strong reason for renaming it?


> On Feb. 9, 2015, 2:51 a.m., Rui Li wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java, line 76
> > <https://reviews.apache.org/r/30739/diff/4/?file=853475#file853475line76>
> >
> >     Do we need a parameter here? Seems it can just use writeCursor?

You are right. It is good to use writeCursor.


> On Feb. 9, 2015, 2:51 a.m., Rui Li wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java, line 236
> > <https://reviews.apache.org/r/30739/diff/4/?file=853475#file853475line236>
> >
> >     I suppose this is to avoid frequent switch buffer? But why the magic
> >     number 1?

Right. If it is 1, there is no need to switch buffers. For any other number, we need to switch anyway. I assume there are many scenarios where there is just one row.


- Jimmy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30739/#review71597
-----------------------------------------------------------


On Feb. 7, 2015, 3:09 a.m., Jimmy Xiang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30739/
> -----------------------------------------------------------
>
> (Updated Feb. 7, 2015, 3:09 a.m.)
>
>
> Review request for hive, Rui Li and Xuefu Zhang.
>
>
> Bugs: HIVE-9574
>     https://issues.apache.org/jira/browse/HIVE-9574
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Result KV cache doesn't use RowContainer any more since it has logic we don't
> need, which is some overhead. We don't do lazy computing right away, instead
> we wait a little till the cache is close to spill.
>
>
> Diffs
> -----
>
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java 78ab680
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java 8ead0cb
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunction.java 7a09b4d
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunctionResultList.java e92e299
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java 070ea4d
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunctionResultList.java d4ff37c
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/KryoSerializer.java 286816b
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 0df4598
>
> Diff: https://reviews.apache.org/r/30739/diff/
>
>
> Testing
> -------
>
> Unit test, test on cluster
>
>
> Thanks,
>
> Jimmy Xiang
>
>
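
For readers following the two-buffer discussion, the idea can be sketched roughly as below. This is only an illustration and not the code in HiveKVResultCache: the class name, the String row type, the capacity parameter, and all method names are assumptions, and an in-memory deque stands in for spilling to disk. It shows why switching is cheap (the two array references and cursors are simply exchanged) while rows still come out in insertion order.

```java
import java.util.ArrayDeque;
import java.util.NoSuchElementException;

/** Illustrative sketch only: a FIFO row cache built on two fixed-size buffers. */
class TwoBufferCacheSketch {
  private final int capacity;        // hypothetical in-memory row limit per buffer
  private String[] writeBuffer;
  private String[] readBuffer;
  private int writeCursor = 0;       // next free slot in writeBuffer
  private int readCursor = 0;        // next unread slot in readBuffer
  private int rowsInReadBuffer = 0;  // number of valid rows in readBuffer
  // Stand-in for spilled data; a real cache would write full batches to disk.
  private final ArrayDeque<String[]> spilled = new ArrayDeque<>();

  TwoBufferCacheSketch(int capacity) {
    this.capacity = capacity;
    this.writeBuffer = new String[capacity];
    this.readBuffer = new String[capacity];
  }

  void add(String row) {
    if (writeCursor == capacity) {
      if (readCursor == rowsInReadBuffer && spilled.isEmpty()) {
        // Nothing older is pending, so the full write buffer can be read next.
        switchBuffers();
      } else {
        // Older rows are still unread: set the full batch aside ("spill").
        spilled.add(writeBuffer);
        writeBuffer = new String[capacity];
        writeCursor = 0;
      }
    }
    writeBuffer[writeCursor++] = row;
  }

  boolean hasNext() {
    return readCursor < rowsInReadBuffer || !spilled.isEmpty() || writeCursor > 0;
  }

  String next() {
    if (readCursor == rowsInReadBuffer) {
      if (!spilled.isEmpty()) {
        // Spilled batches are always full and older than the write buffer.
        readBuffer = spilled.poll();
        rowsInReadBuffer = capacity;
        readCursor = 0;
      } else if (writeCursor > 0) {
        switchBuffers();
      } else {
        throw new NoSuchElementException();
      }
    }
    String row = readBuffer[readCursor];
    readBuffer[readCursor++] = null;  // drop the reference once the row is consumed
    return row;
  }

  /** Cheap switch: exchange the array references and reset the cursors. */
  private void switchBuffers() {
    String[] drained = readBuffer;
    readBuffer = writeBuffer;
    rowsInReadBuffer = writeCursor;
    readCursor = 0;
    writeBuffer = drained;
    writeCursor = 0;
  }
}
```

A queue would give the same ordering with less bookkeeping; the array version above just makes the read/write switch explicit.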