Re: Difference between hashJoin and innerJoin in Streaming Expression
Hi Joel,

Thanks for the information.

Regards,
Edwin

On 25 March 2017 at 10:15, Joel Bernstein wrote:
> The innerJoin is a merge join and the hashJoin is a hash join.
Re: Difference between hashJoin and innerJoin in Streaming Expression
The innerJoin is a merge join and the hashJoin is a hash join.

The merge join can support joins of unlimited size and never runs out of memory. But it requires that both sides of the join are sorted on the join keys.

The hash join reads one side of the join into a hash map keyed on the join keys. This doesn't require any specific sort, but it is limited in size by how much data can fit in the hash map.

You can parallelize both joins using the parallel function to improve scalability and performance.

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Mar 24, 2017 at 4:49 AM, Zheng Lin Edwin Yeo wrote:
> What is the main difference between hashJoin and innerJoin in Solr
> Streaming Expression?
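To make the trade-off described above concrete, here is a minimal Python sketch of the two join strategies. It is not Solr's implementation: the function names `hash_join` and `merge_join`, the `id` join key, and the sample rows are all made up for illustration. It only shows why one strategy needs sorted input and the other needs memory for one whole side.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Hash join sketch: load ALL of `right` into a dict, then stream `left`.

    No sort order is needed, but memory grows with the size of `right`.
    """
    table = defaultdict(list)
    for row in right:
        table[row[key]].append(row)
    for row in left:
        for match in table.get(row[key], []):
            yield {**match, **row}


def merge_join(left, right, key):
    """Merge join sketch: both inputs must already be sorted ascending on `key`.

    Only the current group of equal-key rows from `right` is buffered,
    so memory use does not depend on the total input size.
    """
    left_it, right_it = iter(left), iter(right)
    l_row = next(left_it, None)
    r_row = next(right_it, None)
    while l_row is not None and r_row is not None:
        if l_row[key] < r_row[key]:
            l_row = next(left_it, None)
        elif l_row[key] > r_row[key]:
            r_row = next(right_it, None)
        else:
            current = l_row[key]
            # Buffer the right-hand rows that share the current key.
            group = []
            while r_row is not None and r_row[key] == current:
                group.append(r_row)
                r_row = next(right_it, None)
            # Pair every left-hand row with that key against the group.
            while l_row is not None and l_row[key] == current:
                for match in group:
                    yield {**match, **l_row}
                l_row = next(left_it, None)


# Hypothetical sample data, both sides sorted on "id".
people = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
pets = [{"id": 1, "pet": "cat"}, {"id": 3, "pet": "dog"}, {"id": 3, "pet": "fish"}]

hashed = list(hash_join(people, pets, "id"))
merged = list(merge_join(people, pets, "id"))
```

On sorted input like this, both functions emit the same joined tuples, which matches the observation in the original question that the results were identical. The difference shows up elsewhere: if one side is not sorted on the join key, the merge join sketch can silently miss matches, while the hash join sketch still works as long as the hashed side fits in memory.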
Difference between hashJoin and innerJoin in Streaming Expression
Hi,

What is the main difference between hashJoin and innerJoin in Solr Streaming Expressions?

I understand that both will emit a tuple containing the fields of both tuples.

When I tried both hashJoin and innerJoin with the same query, I got exactly the same results, and there was no difference in performance.

Under what circumstances should we use hashJoin, and under what circumstances should we use innerJoin?

Regards,
Edwin