Re: InterpretedUnsafeProjection - error in getElementSize

2020-07-24 Thread Wenchen Fan
Can you create a JIRA ticket? There are many people happy to help to fix it.

On Tue, Jul 21, 2020 at 9:49 PM Janda Martin  wrote:

> Hi,
>   I think that I found error in
> InterpretedUnsafeProjection::getElementSize. This method differs from
> similar implementation in GenerateUnsafeProjection.
>
>  InterpretedUnsafeProjection::getElementSize - returns wrong size for
> UDTs. I suggest to use similar code from GenerateUnsafeProjection.
>
>
> Test type:
> new ArrayType(new CustomUDT())
>
> CustomUDT with sqlType=StringType
>
>
>
>   Thank you
>  Martin
>
>
> InterpretedUnsafeProjection implementation:
>
>   private def getElementSize(dataType: DataType): Int = dataType match {
> case NullType | StringType | BinaryType | CalendarIntervalType |
>  _: DecimalType | _: StructType | _: ArrayType | _: MapType => 8
> case _ => dataType.defaultSize
>   }
>
>
> GenerateUnsafeProjection implementation:
>
> val et = UserDefinedType.sqlType(elementType)
> ...
>
> val elementOrOffsetSize = et match {
>   case t: DecimalType if t.precision <= Decimal.MAX_LONG_DIGITS => 8
>   case _ if CodeGenerator.isPrimitiveType(jt) => et.defaultSize
>   case _ => 8  // we need 8 bytes to store offset and length
> }
>
>
>
> PS Following line is not necessary because DecimalType is not primitive
> type - so it should be covered by default size=8.
>
>   case t: DecimalType if t.precision <= Decimal.MAX_LONG_DIGITS => 8
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


InterpretedUnsafeProjection - error in getElementSize

2020-07-21 Thread Janda Martin
Hi,
  I think that I found error in InterpretedUnsafeProjection::getElementSize. 
This method differs from similar implementation in GenerateUnsafeProjection.

 InterpretedUnsafeProjection::getElementSize - returns wrong size for UDTs. I 
suggest to use similar code from GenerateUnsafeProjection.


Test type:
new ArrayType(new CustomUDT())

CustomUDT with sqlType=StringType



  Thank you
 Martin


InterpretedUnsafeProjection implementation:

  private def getElementSize(dataType: DataType): Int = dataType match {
case NullType | StringType | BinaryType | CalendarIntervalType |
 _: DecimalType | _: StructType | _: ArrayType | _: MapType => 8
case _ => dataType.defaultSize
  }


GenerateUnsafeProjection implementation:

val et = UserDefinedType.sqlType(elementType)
...

val elementOrOffsetSize = et match {
  case t: DecimalType if t.precision <= Decimal.MAX_LONG_DIGITS => 8
  case _ if CodeGenerator.isPrimitiveType(jt) => et.defaultSize
  case _ => 8  // we need 8 bytes to store offset and length
}



PS Following line is not necessary because DecimalType is not primitive type - 
so it should be covered by default size=8.

  case t: DecimalType if t.precision <= Decimal.MAX_LONG_DIGITS => 8

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org