jihoonson commented on a change in pull request #10383:
URL: https://github.com/apache/druid/pull/10383#discussion_r521857100
##########
File path:
indexing-service/src/test/java/org/apache/druid/indexing/overlord/sampler/InputSourceSamplerTest.java
##########
@@ -1163,6 +1172,121 @@ public void testIndexParseException() throws IOException
);
}
+ /**
+ *
+ * This case tests sampling for multiple json lines in one text block
+ * Currently only RecordSupplierInputSource supports this kind of input, see https://github.com/apache/druid/pull/10383 for more information
+ *
+ * This test combines illegal json block and legal json block together to verify:
+ * 1. all lines in the illegal json block should not be parsed
+ * 2. the illegal json block should not affect the processing of the 2nd record
+ * 3. all lines in legal json block should be parsed successfully
+ *
+ */
+ @Test
+ public void testMultipleJsonStringInOneBlock() throws IOException
+ {
+ if (!ParserType.STR_JSON.equals(parserType) || !useInputFormatApi) {
+ return;
+ }
+
+ final TimestampSpec timestampSpec = new TimestampSpec("t", null, null);
+ final DimensionsSpec dimensionsSpec = new DimensionsSpec(
+ ImmutableList.of(StringDimensionSchema.create("dim1PlusBar"))
+ );
+ final TransformSpec transformSpec = new TransformSpec(
+ null,
+ ImmutableList.of(new ExpressionTransform("dim1PlusBar", "concat(dim1 + 'bar')", TestExprMacroTable.INSTANCE))
+ );
+ final AggregatorFactory[] aggregatorFactories = {new LongSumAggregatorFactory("met1", "met1")};
+ final GranularitySpec granularitySpec = new UniformGranularitySpec(
+ Granularities.DAY,
+ Granularities.HOUR,
+ true,
+ null
+ );
+ final DataSchema dataSchema = createDataSchema(
+ timestampSpec,
+ dimensionsSpec,
+ aggregatorFactories,
+ granularitySpec,
+ transformSpec
+ );
+
+ List<String> jsonBlockList = ImmutableList.of(
+ // include the line which can't be parsed into JSON object to form a illegal json block
+ String.join("", STR_JSON_ROWS),
+
+ // exclude the last line to form a legal json block
+ String.join("", STR_JSON_ROWS.stream().limit(STR_JSON_ROWS.size() - 1).collect(Collectors.toList()))
Review comment:
> Every time I push a commit to a branch that is being merged, I first run the
test cases in the module that contains the changes in that commit. If they pass,
I push the commit. But I find that there is a high probability that CI fails
anyway. Sometimes it is related to the inspection check, sometimes it is caused
by test failures in other modules, sometimes it is the dependency check, and
sometimes it has something to do with the license check.
>
> I wonder what steps you follow to check before pushing a commit. Do you run
all the test cases in all modules? Or is there a simple way to run the checks
mentioned above?
@FrankChen021 That's what I usually do as well. To be honest, unexpected CI
failures annoy me too :sweat_smile: You can run those checks on your own by
running the same commands that Travis runs. You may also want to set up some
pre-commit/post-commit hooks. The best would be some automatic correction for
trivial issues, but I'm not sure whether there is such a tool available that is
mature and reliable enough.
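
As a rough illustration of the hook idea above, here is a minimal sketch of a
small Java runner that a pre-commit or pre-push hook could invoke. Everything
in it is an assumption made for illustration: the class name `LocalChecks`, the
method names, and in particular the list of Maven commands, which is a
placeholder and not taken from the Druid build. The real list should be copied
from whatever the project's Travis configuration actually runs.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical helper, not part of Druid: runs a fixed list of check commands
// and reports the first one that fails.
final class LocalChecks
{
  // Placeholder commands (assumption). Replace them with the commands from
  // the project's CI configuration.
  static final List<List<String>> CHECKS = Arrays.asList(
      Arrays.asList("mvn", "checkstyle:check"),
      Arrays.asList("mvn", "apache-rat:check"),
      Arrays.asList("mvn", "test", "-pl", "indexing-service")
  );

  // Runs one command in the current directory and returns its exit code.
  static int runProcess(List<String> command)
  {
    try {
      return new ProcessBuilder(command).inheritIO().start().waitFor();
    }
    catch (IOException e) {
      return 127; // the command could not be started
    }
    catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return 130; // interrupted while waiting for the command
    }
  }

  // Returns the index of the first check whose exit code is non-zero,
  // or -1 if every check passes. Stops at the first failure.
  static int firstFailure(List<List<String>> checks, ToIntFunction<List<String>> runner)
  {
    for (int i = 0; i < checks.size(); i++) {
      if (runner.applyAsInt(checks.get(i)) != 0) {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args)
  {
    int failed = firstFailure(CHECKS, LocalChecks::runProcess);
    if (failed >= 0) {
      System.err.println("Check failed: " + String.join(" ", CHECKS.get(failed)));
      System.exit(1); // a non-zero exit makes a git hook abort the operation
    }
  }
}
```

The runner is passed in as a function so the stop-at-first-failure logic can be
exercised without actually launching Maven. A hook script would only need to
call `main` and propagate its exit code.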