Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/6738#issuecomment-111199172
  
    For the UnsafeFixedWidthAggregationMap suite:
    
    ```
    [info] - inserting large random keys *** FAILED *** (176 milliseconds)
    [info]   461 was not equal to 512 (UnsafeFixedWidthAggregationMapSuite.scala:121)
    [info]   org.scalatest.exceptions.TestFailedException:
    [info]   at org.scalatest.MatchersHelper$.newTestFailedException(MatchersHelper.scala:160)
    [info]   at org.scalatest.Matchers$ShouldMethodHelper$.shouldMatcher(Matchers.scala:6231)
    [info]   at org.scalatest.Matchers$AnyShouldWrapper.should(Matchers.scala:6265)
    [info]   at 
    ```
    
    Here's the source of that test:
    
    ```scala
      test("inserting large random keys") {
        val map = new UnsafeFixedWidthAggregationMap(
          emptyAggregationBuffer,
          aggBufferSchema,
          groupKeySchema,
          memoryManager,
          128, // initial capacity
          false // disable perf metrics
        )
        val rand = new Random(42)
        val groupKeys: Set[String] = Seq.fill(512)(rand.nextString(1024)).toSet
        groupKeys.foreach { keyString =>
          map.getAggregationBuffer(new GenericRow(Array[Any](UTF8String.fromString(keyString))))
        }
        val seenKeys: Set[String] = map.iterator().asScala.map { entry =>
          entry.key.getString(0)
        }.toSet
        seenKeys.size should be (groupKeys.size) // <- this is the assertion that's failing
        seenKeys should be (groupKeys)
      }
    ```
    
    Maybe we can add some print statements / logging to figure out which strings are being returned? Based on the failure message, I wonder whether there's an issue with `equals()` or `hashCode()`, but I can't spot anything that's obviously wrong. One culprit to watch out for might be `.equals()` vs. `==` in Java vs. Scala, but I think that's a long shot.
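
    As a rough sketch of the kind of logging I have in mind: the helper below compares the inserted keys with the keys read back and prints a compact summary of the ones that differ. `KeyDiff`, `diff`, `describe`, and `report` are hypothetical names made up for illustration; none of them exist in Spark, and this only narrows down which keys go missing, not why.

    ```scala
    // Hypothetical debugging helper; not part of Spark or of the test suite.
    object KeyDiff {
      // Keys that were inserted but not read back, and keys read back that
      // were never inserted.
      final case class Diff(missing: Set[String], unexpected: Set[String])

      def diff(expected: Set[String], seen: Set[String]): Diff =
        Diff(expected -- seen, seen -- expected)

      // Random 1024-character strings are unreadable when printed raw, so
      // log the length, the hash code, and the first few UTF-16 code units.
      def describe(key: String): String = {
        val prefix = key.take(4).map(c => f"U+${c.toInt}%04X").mkString(" ")
        s"length=${key.length} hashCode=${key.hashCode} prefix=[$prefix]"
      }

      // Print the set sizes plus up to five example keys from each side.
      def report(expected: Set[String], seen: Set[String]): Unit = {
        val d = diff(expected, seen)
        println(s"expected=${expected.size} seen=${seen.size} " +
          s"missing=${d.missing.size} unexpected=${d.unexpected.size}")
        d.missing.take(5).foreach(k => println(s"  missing:    ${describe(k)}"))
        d.unexpected.take(5).foreach(k => println(s"  unexpected: ${describe(k)}"))
      }
    }
    ```

    In the test above, this could be called as `KeyDiff.report(groupKeys, seenKeys)` just before the size assertion. If `missing` and `unexpected` have the same size, the keys are probably being altered on the way in or out; if only `missing` is non-empty, distinct keys are more likely being treated as equal inside the map.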

