wyb commented on a change in pull request #4524:
URL: https://github.com/apache/incubator-doris/pull/4524#discussion_r484063836



##########
File path: fe/spark-dpp/src/main/java/org/apache/doris/load/loadv2/dpp/SparkDpp.java
##########
@@ -358,12 +362,44 @@ private void processRollupTree(RollupTreeNode rootNode,
         return Pair.of(keyMap.toArray(new Integer[keyMap.size()]), valueMap.toArray(new Integer[valueMap.size()]));
     }
 
-    // repartition dataframe by partitionid_bucketid
-    // so data in the same bucket will be consecutive.
-    private JavaPairRDD<List<Object>, Object[]> fillTupleWithPartitionColumn(SparkSession spark, Dataset<Row> dataframe,
+    /**
+     *   check decimal,char/varchar
+     */
+    private boolean validateData(Object srcValue, EtlJobConfig.EtlColumn etlColumn, ColumnParser columnParser,Row row) {
+
+        switch (etlColumn.columnType.toUpperCase()) {
+            case "DECIMALV2":
+                // TODO(wb):  support decimal round; see be DecimalV2Value::round
+                DecimalParser decimalParser = (DecimalParser) columnParser;
+                BigDecimal srcBigDecimal = (BigDecimal) srcValue;
+                if (srcValue != null && (decimalParser.getMaxValue().compareTo(srcBigDecimal) < 0 || decimalParser.getMinValue().compareTo(srcBigDecimal) > 0)) {
+                    LOG.warn(String.format("decimal value is not valid for defination, column=%s, value=%s,precision=%s,scale=%s",
+                            etlColumn.columnName, srcValue.toString(), srcBigDecimal.precision(), srcBigDecimal.scale()));
+                    abnormalRowAcc.add(1);
+                    return false;
+                }
+                break;
+            case "CHAR":
+            case "VARCHAR":
+                // TODO(wb) padding char type
+                if (srcValue != null && srcValue.toString().length() > etlColumn.stringLength) {

Review comment:
       This check should compare the byte length of the value, not the character count returned by String.length().
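
       To illustrate the difference, here is a minimal sketch. It assumes the column length limit is counted in bytes of the UTF-8 encoded value; the encoding is an assumption, not something stated in this diff. The class name ByteLengthCheck and the helper fitsInColumn are hypothetical and are not part of SparkDpp.

```java
import java.nio.charset.StandardCharsets;

public class ByteLengthCheck {

    // Hypothetical helper, not part of SparkDpp: returns true when the value's
    // encoded size fits in a column limited to stringLength bytes.
    // Assumption: the limit is measured in UTF-8 bytes.
    static boolean fitsInColumn(Object srcValue, int stringLength) {
        if (srcValue == null) {
            // Null values carry no bytes, so they always pass the length check.
            return true;
        }
        int byteLength = srcValue.toString().getBytes(StandardCharsets.UTF_8).length;
        return byteLength <= stringLength;
    }

    public static void main(String[] args) {
        String ascii = "abc";
        String multiByte = "中文";

        // For ASCII text the character count and the UTF-8 byte count agree.
        System.out.println(ascii.length() + " chars, "
                + ascii.getBytes(StandardCharsets.UTF_8).length + " bytes");

        // Each of these two characters takes three bytes in UTF-8, so a check
        // based on String.length() would undercount the stored size.
        System.out.println(multiByte.length() + " chars, "
                + multiByte.getBytes(StandardCharsets.UTF_8).length + " bytes");

        System.out.println(fitsInColumn(multiByte, 2)); // false
        System.out.println(fitsInColumn(multiByte, 6)); // true
    }
}
```

       With a character-based check, the two-character string above would pass a limit of 2 even though it needs 6 bytes when encoded as UTF-8.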




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
