[GitHub] spark pull request: [SPARK-8968] [SQL] external sort by the partit...

scwf Wed, 20 Jan 2016 07:43:17 -0800

Github user scwf commented on a diff in the pull request:

    https://github.com/apache/spark/pull/7336#discussion_r50269169
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveWriterContainers.scala ---
    @@ -198,33 +241,99 @@ private[spark] class SparkHiveDynamicPartitionWriterContainer(
         conf.value.setBoolean(SUCCESSFUL_JOB_OUTPUT_DIR_MARKER, oldMarker)
       }
     
    -  override def getLocalFileWriter(row: InternalRow, schema: StructType)
    -    : FileSinkOperator.RecordWriter = {
    -    def convertToHiveRawString(col: String, value: Any): String = {
    -      val raw = String.valueOf(value)
    -      schema(col).dataType match {
    -        case DateType => DateTimeUtils.dateToString(raw.toInt)
    -        case _: DecimalType => BigDecimal(raw).toString()
    -        case _ => raw
    -      }
    +  // this function is executed on executor side
    +  override def writeToFile(context: TaskContext, iterator: Iterator[InternalRow]): Unit = {
    +    val serializer = newSerializer(fileSinkConf.getTableInfo)
    +    val standardOI = ObjectInspectorUtils
    +      .getStandardObjectInspector(
    +        fileSinkConf.getTableInfo.getDeserializer.getObjectInspector,
    +        ObjectInspectorCopyOption.JAVA)
    +      .asInstanceOf[StructObjectInspector]
    +
    +    val fieldOIs = standardOI.getAllStructFieldRefs.asScala.map(_.getFieldObjectInspector).toArray
    +    val dataTypes = inputSchema.map(_.dataType)
    +    val wrappers = fieldOIs.zip(dataTypes).map { case (f, dt) => wrapperFor(f, dt) }
    +    val outputData = new Array[Any](fieldOIs.length)
    --- End diff --
    
    Yes, I extracted a common method for it.


