Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
LadyForest closed pull request #23678: [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default URL: https://github.com/apache/flink/pull/23678 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1429393701 ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/DataGenTableSourceFactory.java: ## @@ -141,15 +143,15 @@ private DataGeneratorContainer createContainer( } private void validateFieldOptions(String name, DataType type, ReadableConfig options) { -ConfigOption lenOption = +ConfigOption varLenOption = key(DataGenConnectorOptionsUtil.FIELDS + "." + name + "." + DataGenConnectorOptionsUtil.VAR_LEN) .booleanType() .defaultValue(false); -options.getOptional(lenOption) +options.getOptional(varLenOption) Review Comment: sorry to misunderstand. I invoke this to filter the table that contains ` 'fields.#.var-len' = 'true' `, then we could use `ifPresent` check if it is sufficient to use such option. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
LadyForest commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1429375718 ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/DataGenTableSourceFactory.java: ## @@ -141,15 +143,15 @@ private DataGeneratorContainer createContainer( } private void validateFieldOptions(String name, DataType type, ReadableConfig options) { -ConfigOption lenOption = +ConfigOption varLenOption = key(DataGenConnectorOptionsUtil.FIELDS + "." + name + "." + DataGenConnectorOptionsUtil.VAR_LEN) .booleanType() .defaultValue(false); -options.getOptional(lenOption) +options.getOptional(varLenOption) Review Comment: > the prior name is not clear enough, so I use a new one instead. I mean L#155 `filter(option -> option)` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on PR #23678: URL: https://github.com/apache/flink/pull/23678#issuecomment-1855638485 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1423393030 ## flink-table/flink-table-api-java-bridge/src/test/java/org/apache/flink/table/factories/DataGenTableSourceFactoryTest.java: ## @@ -297,10 +295,104 @@ void testVariableLengthDataGeneration() throws Exception { anyCauseMatches( ValidationException.class, String.format( -"Only supports specifying '%s' option for variable-length types (varchar, string, varbinary, bytes). The type of field %s is not within this range.", +"Only supports specifying '%s' option for variable-length types (VARCHAR/STRING/VARBINARY/BYTES). The type of field '%s' is not within this range.", DataGenConnectorOptions.FIELD_VAR_LEN.key(), "f4"))); } +@Test +void testVariableLengthDataType() throws Exception { +DescriptorProperties descriptor = new DescriptorProperties(); +final int rowsNumber = 200; +descriptor.putString(FactoryUtil.CONNECTOR.key(), "datagen"); +descriptor.putLong(DataGenConnectorOptions.NUMBER_OF_ROWS.key(), rowsNumber); + +List results = runGenerator(LENGTH_CONSTRAINED_SCHEMA, descriptor); +assertThat(results).hasSize(rowsNumber); + +for (RowData row : results) { +assertThat(row.getString(2).toString()).hasSize(30); +assertThat(row.getBinary(3)).hasSize(20); +assertThat(row.getString(4).toString()) + .hasSize(RandomGeneratorVisitor.RANDOM_STRING_LENGTH_DEFAULT); +} + +descriptor.putString( +DataGenConnectorOptionsUtil.FIELDS + ".f2." + DataGenConnectorOptionsUtil.KIND, +DataGenConnectorOptionsUtil.RANDOM); +descriptor.putLong( +DataGenConnectorOptionsUtil.FIELDS + ".f2." + DataGenConnectorOptionsUtil.LENGTH, +25); +descriptor.putString( +DataGenConnectorOptionsUtil.FIELDS + ".f4." + DataGenConnectorOptionsUtil.KIND, +DataGenConnectorOptionsUtil.RANDOM); +descriptor.putLong( +DataGenConnectorOptionsUtil.FIELDS + ".f4." + DataGenConnectorOptionsUtil.LENGTH, +); + +results = runGenerator(LENGTH_CONSTRAINED_SCHEMA, descriptor); + +for (RowData row : results) { +assertThat(row.getString(2).toString()).hasSize(25); +assertThat(row.getString(4).toString()).hasSize(); +} + +assertThatThrownBy( +() -> { +descriptor.putString( +DataGenConnectorOptionsUtil.FIELDS ++ ".f3." ++ DataGenConnectorOptionsUtil.KIND, +DataGenConnectorOptionsUtil.RANDOM); +descriptor.putLong( +DataGenConnectorOptionsUtil.FIELDS ++ ".f3." ++ DataGenConnectorOptionsUtil.LENGTH, +21); + +runGenerator(LENGTH_CONSTRAINED_SCHEMA, descriptor); +}) +.satisfies( +anyCauseMatches( +ValidationException.class, +"Custom length '21' for variable-length type (VARCHAR/STRING/VARBINARY/BYTES) field 'f3' should be shorter than '20' defined in the schema.")); +} + +@Test +void testFixedLengthDataType() throws Exception { +DescriptorProperties descriptor = new DescriptorProperties(); +final int rowsNumber = 200; +descriptor.putString(FactoryUtil.CONNECTOR.key(), "datagen"); +descriptor.putLong(DataGenConnectorOptions.NUMBER_OF_ROWS.key(), rowsNumber); + +List results = runGenerator(LENGTH_CONSTRAINED_SCHEMA, descriptor); +assertThat(results).hasSize(rowsNumber); + +for (RowData row : results) { +assertThat(row.getString(0).toString()).hasSize(50); +assertThat(row.getBinary(1)).hasSize(40); +} + +assertThatThrownBy( +() -> { +descriptor.putString( +DataGenConnectorOptionsUtil.FIELDS ++ ".f0." ++ DataGenConnectorOptionsUtil.KIND, +DataGenConnectorOptionsUtil.RANDOM); +descriptor.putLong( +DataGenConnectorOptionsUtil.FIELDS +
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1423393030 ## flink-table/flink-table-api-java-bridge/src/test/java/org/apache/flink/table/factories/DataGenTableSourceFactoryTest.java: ## @@ -297,10 +295,104 @@ void testVariableLengthDataGeneration() throws Exception { anyCauseMatches( ValidationException.class, String.format( -"Only supports specifying '%s' option for variable-length types (varchar, string, varbinary, bytes). The type of field %s is not within this range.", +"Only supports specifying '%s' option for variable-length types (VARCHAR/STRING/VARBINARY/BYTES). The type of field '%s' is not within this range.", DataGenConnectorOptions.FIELD_VAR_LEN.key(), "f4"))); } +@Test +void testVariableLengthDataType() throws Exception { +DescriptorProperties descriptor = new DescriptorProperties(); +final int rowsNumber = 200; +descriptor.putString(FactoryUtil.CONNECTOR.key(), "datagen"); +descriptor.putLong(DataGenConnectorOptions.NUMBER_OF_ROWS.key(), rowsNumber); + +List results = runGenerator(LENGTH_CONSTRAINED_SCHEMA, descriptor); +assertThat(results).hasSize(rowsNumber); + +for (RowData row : results) { +assertThat(row.getString(2).toString()).hasSize(30); +assertThat(row.getBinary(3)).hasSize(20); +assertThat(row.getString(4).toString()) + .hasSize(RandomGeneratorVisitor.RANDOM_STRING_LENGTH_DEFAULT); +} + +descriptor.putString( +DataGenConnectorOptionsUtil.FIELDS + ".f2." + DataGenConnectorOptionsUtil.KIND, +DataGenConnectorOptionsUtil.RANDOM); +descriptor.putLong( +DataGenConnectorOptionsUtil.FIELDS + ".f2." + DataGenConnectorOptionsUtil.LENGTH, +25); +descriptor.putString( +DataGenConnectorOptionsUtil.FIELDS + ".f4." + DataGenConnectorOptionsUtil.KIND, +DataGenConnectorOptionsUtil.RANDOM); +descriptor.putLong( +DataGenConnectorOptionsUtil.FIELDS + ".f4." + DataGenConnectorOptionsUtil.LENGTH, +); + +results = runGenerator(LENGTH_CONSTRAINED_SCHEMA, descriptor); + +for (RowData row : results) { +assertThat(row.getString(2).toString()).hasSize(25); +assertThat(row.getString(4).toString()).hasSize(); +} + +assertThatThrownBy( +() -> { +descriptor.putString( +DataGenConnectorOptionsUtil.FIELDS ++ ".f3." ++ DataGenConnectorOptionsUtil.KIND, +DataGenConnectorOptionsUtil.RANDOM); +descriptor.putLong( +DataGenConnectorOptionsUtil.FIELDS ++ ".f3." ++ DataGenConnectorOptionsUtil.LENGTH, +21); + +runGenerator(LENGTH_CONSTRAINED_SCHEMA, descriptor); +}) +.satisfies( +anyCauseMatches( +ValidationException.class, +"Custom length '21' for variable-length type (VARCHAR/STRING/VARBINARY/BYTES) field 'f3' should be shorter than '20' defined in the schema.")); +} + +@Test +void testFixedLengthDataType() throws Exception { +DescriptorProperties descriptor = new DescriptorProperties(); +final int rowsNumber = 200; +descriptor.putString(FactoryUtil.CONNECTOR.key(), "datagen"); +descriptor.putLong(DataGenConnectorOptions.NUMBER_OF_ROWS.key(), rowsNumber); + +List results = runGenerator(LENGTH_CONSTRAINED_SCHEMA, descriptor); +assertThat(results).hasSize(rowsNumber); + +for (RowData row : results) { +assertThat(row.getString(0).toString()).hasSize(50); +assertThat(row.getBinary(1)).hasSize(40); +} + +assertThatThrownBy( +() -> { +descriptor.putString( +DataGenConnectorOptionsUtil.FIELDS ++ ".f0." ++ DataGenConnectorOptionsUtil.KIND, +DataGenConnectorOptionsUtil.RANDOM); +descriptor.putLong( +DataGenConnectorOptionsUtil.FIELDS +
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1423385053 ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/DataGenTableSourceFactory.java: ## @@ -141,15 +143,15 @@ private DataGeneratorContainer createContainer( } private void validateFieldOptions(String name, DataType type, ReadableConfig options) { -ConfigOption lenOption = +ConfigOption varLenOption = key(DataGenConnectorOptionsUtil.FIELDS + "." + name + "." + DataGenConnectorOptionsUtil.VAR_LEN) .booleanType() .defaultValue(false); -options.getOptional(lenOption) +options.getOptional(varLenOption) Review Comment: the prior name is not clear enough, so I use a new one instead. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
LadyForest commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1423376727 ## flink-table/flink-table-api-java-bridge/src/test/java/org/apache/flink/table/factories/DataGenTableSourceFactoryTest.java: ## @@ -297,10 +295,104 @@ void testVariableLengthDataGeneration() throws Exception { anyCauseMatches( ValidationException.class, String.format( -"Only supports specifying '%s' option for variable-length types (varchar, string, varbinary, bytes). The type of field %s is not within this range.", +"Only supports specifying '%s' option for variable-length types (VARCHAR/STRING/VARBINARY/BYTES). The type of field '%s' is not within this range.", DataGenConnectorOptions.FIELD_VAR_LEN.key(), "f4"))); } +@Test +void testVariableLengthDataType() throws Exception { +DescriptorProperties descriptor = new DescriptorProperties(); +final int rowsNumber = 200; +descriptor.putString(FactoryUtil.CONNECTOR.key(), "datagen"); +descriptor.putLong(DataGenConnectorOptions.NUMBER_OF_ROWS.key(), rowsNumber); + +List results = runGenerator(LENGTH_CONSTRAINED_SCHEMA, descriptor); +assertThat(results).hasSize(rowsNumber); + +for (RowData row : results) { +assertThat(row.getString(2).toString()).hasSize(30); +assertThat(row.getBinary(3)).hasSize(20); +assertThat(row.getString(4).toString()) + .hasSize(RandomGeneratorVisitor.RANDOM_STRING_LENGTH_DEFAULT); +} + +descriptor.putString( +DataGenConnectorOptionsUtil.FIELDS + ".f2." + DataGenConnectorOptionsUtil.KIND, +DataGenConnectorOptionsUtil.RANDOM); +descriptor.putLong( +DataGenConnectorOptionsUtil.FIELDS + ".f2." + DataGenConnectorOptionsUtil.LENGTH, +25); +descriptor.putString( +DataGenConnectorOptionsUtil.FIELDS + ".f4." + DataGenConnectorOptionsUtil.KIND, +DataGenConnectorOptionsUtil.RANDOM); +descriptor.putLong( +DataGenConnectorOptionsUtil.FIELDS + ".f4." + DataGenConnectorOptionsUtil.LENGTH, +); + +results = runGenerator(LENGTH_CONSTRAINED_SCHEMA, descriptor); + +for (RowData row : results) { +assertThat(row.getString(2).toString()).hasSize(25); +assertThat(row.getString(4).toString()).hasSize(); +} + +assertThatThrownBy( +() -> { +descriptor.putString( +DataGenConnectorOptionsUtil.FIELDS ++ ".f3." ++ DataGenConnectorOptionsUtil.KIND, +DataGenConnectorOptionsUtil.RANDOM); +descriptor.putLong( +DataGenConnectorOptionsUtil.FIELDS ++ ".f3." ++ DataGenConnectorOptionsUtil.LENGTH, +21); + +runGenerator(LENGTH_CONSTRAINED_SCHEMA, descriptor); +}) +.satisfies( +anyCauseMatches( +ValidationException.class, +"Custom length '21' for variable-length type (VARCHAR/STRING/VARBINARY/BYTES) field 'f3' should be shorter than '20' defined in the schema.")); +} + +@Test +void testFixedLengthDataType() throws Exception { +DescriptorProperties descriptor = new DescriptorProperties(); +final int rowsNumber = 200; +descriptor.putString(FactoryUtil.CONNECTOR.key(), "datagen"); +descriptor.putLong(DataGenConnectorOptions.NUMBER_OF_ROWS.key(), rowsNumber); + +List results = runGenerator(LENGTH_CONSTRAINED_SCHEMA, descriptor); +assertThat(results).hasSize(rowsNumber); + +for (RowData row : results) { +assertThat(row.getString(0).toString()).hasSize(50); +assertThat(row.getBinary(1)).hasSize(40); +} + +assertThatThrownBy( +() -> { +descriptor.putString( +DataGenConnectorOptionsUtil.FIELDS ++ ".f0." ++ DataGenConnectorOptionsUtil.KIND, +DataGenConnectorOptionsUtil.RANDOM); +descriptor.putLong( +DataGenConnectorOptionsUtil.FIELDS +
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1421917082 ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/DataGenConnectorOptions.java: ## @@ -118,10 +118,10 @@ public class DataGenConnectorOptions { .withDescription("The proportion of null values."); /** Placeholder {@link ConfigOption}. Not used for retrieving values. */ -public static final ConfigOption FIELD_VAR_LEN = +public static final ConfigOption FIELD_VAR_LEN = ConfigOptions.key(String.format("%s.#.%s", FIELDS, VAR_LEN)) -.floatType() -.defaultValue(0f) +.booleanType() +.defaultValue(false) Review Comment: just fix typo, to make commits clear, I have divided the modifition into another commit :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1421917082 ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/DataGenConnectorOptions.java: ## @@ -118,10 +118,10 @@ public class DataGenConnectorOptions { .withDescription("The proportion of null values."); /** Placeholder {@link ConfigOption}. Not used for retrieving values. */ -public static final ConfigOption FIELD_VAR_LEN = +public static final ConfigOption FIELD_VAR_LEN = ConfigOptions.key(String.format("%s.#.%s", FIELDS, VAR_LEN)) -.floatType() -.defaultValue(0f) +.booleanType() +.defaultValue(false) Review Comment: just to fix typo, to make commits clear, I have divided the modifition into another commit :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1421910481 ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/RandomGeneratorVisitor.java: ## @@ -489,6 +465,16 @@ protected DataGeneratorContainer defaultMethod(LogicalType logicalType) { throw new ValidationException("Unsupported type: " + logicalType); } +private ConfigOption getLengthOption(Supplier defaultLengthSupplier) { Review Comment: wise idea ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
LadyForest commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1421891193 ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/DataGenConnectorOptions.java: ## @@ -118,10 +118,10 @@ public class DataGenConnectorOptions { .withDescription("The proportion of null values."); /** Placeholder {@link ConfigOption}. Not used for retrieving values. */ -public static final ConfigOption FIELD_VAR_LEN = +public static final ConfigOption FIELD_VAR_LEN = ConfigOptions.key(String.format("%s.#.%s", FIELDS, VAR_LEN)) -.floatType() -.defaultValue(0f) +.booleanType() +.defaultValue(false) Review Comment: I'm a little bit confused. Why change this config type? ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/RandomGeneratorVisitor.java: ## @@ -489,6 +465,16 @@ protected DataGeneratorContainer defaultMethod(LogicalType logicalType) { throw new ValidationException("Unsupported type: " + logicalType); } +private ConfigOption getLengthOption(Supplier defaultLengthSupplier) { Review Comment: `ArrayType`, `MultisetType,` and `MapType` can also use this method. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on PR #23678: URL: https://github.com/apache/flink/pull/23678#issuecomment-1839898221 @LadyForest I have updated commits and rebased on the latest branch, now CI passed, PLAL, thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on PR #23678: URL: https://github.com/apache/flink/pull/23678#issuecomment-1835872939 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on PR #23678: URL: https://github.com/apache/flink/pull/23678#issuecomment-1835806974 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1411842995 ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/DataGenTableSourceFactory.java: ## @@ -127,11 +132,53 @@ private DataGeneratorContainer createContainer( String name, DataType type, String kind, ReadableConfig options) { switch (kind) { case DataGenConnectorOptionsUtil.RANDOM: +validateFieldOptions(name, type, options); return type.getLogicalType().accept(new RandomGeneratorVisitor(name, options)); case DataGenConnectorOptionsUtil.SEQUENCE: return type.getLogicalType().accept(new SequenceGeneratorVisitor(name, options)); default: throw new ValidationException("Unsupported generator kind: " + kind); } } + +private void validateFieldOptions(String name, DataType type, ReadableConfig options) { +ConfigOption lenOption = +key(DataGenConnectorOptionsUtil.FIELDS ++ "." ++ name ++ "." ++ DataGenConnectorOptionsUtil.LENGTH) +.intType() +.noDefaultValue(); +options.getOptional(lenOption) +.ifPresent( +option -> { +LogicalType logicalType = type.getLogicalType(); +if (logicalType instanceof CharType +|| logicalType instanceof BinaryType) { +throw new ValidationException( +String.format( +"User-defined length of the fixed-length field %s is not supported.", +name)); +} +if (logicalType instanceof VarCharType) { +int length = ((VarCharType) logicalType).getLength(); +if (option > length) { +throw new ValidationException( +String.format( +"User-defined length of the VARCHAR field %s should be shorter than the schema definition.", +name)); +} +} +if (logicalType instanceof VarBinaryType) { +int length = ((VarBinaryType) logicalType).getLength(); +if (option > length) { +throw new ValidationException( +String.format( +"User-defined length of the VARBINARY field %s should be shorter than the schema definition.", +name)); +} +} +}); +} Review Comment: done ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/DataGenTableSourceFactory.java: ## @@ -127,11 +132,53 @@ private DataGeneratorContainer createContainer( String name, DataType type, String kind, ReadableConfig options) { switch (kind) { case DataGenConnectorOptionsUtil.RANDOM: +validateFieldOptions(name, type, options); return type.getLogicalType().accept(new RandomGeneratorVisitor(name, options)); case DataGenConnectorOptionsUtil.SEQUENCE: return type.getLogicalType().accept(new SequenceGeneratorVisitor(name, options)); default: throw new ValidationException("Unsupported generator kind: " + kind); } } + +private void validateFieldOptions(String name, DataType type, ReadableConfig options) { +ConfigOption lenOption = +key(DataGenConnectorOptionsUtil.FIELDS ++ "." ++ name ++ "." ++ DataGenConnectorOptionsUtil.LENGTH) +.intType() +.noDefaultValue(); +options.getOptional(lenOption) +.ifPresent( +option -> { +LogicalType logicalType = type.getLogicalType(); +if (logicalType instanceof CharType +|| logicalType instanceof BinaryType) { +throw new
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1411839904 ## flink-table/flink-table-api-java-bridge/src/test/java/org/apache/flink/table/factories/DataGenTableSourceFactoryTest.java: ## @@ -215,6 +216,100 @@ void testSource() throws Exception { } } +@Test +void testVariableLengthDataType() throws Exception { +DescriptorProperties descriptor = new DescriptorProperties(); Review Comment: Thanks for your reminds, I will create another issue to follow up. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1411815025 ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/RandomGeneratorVisitor.java: ## @@ -152,25 +142,20 @@ public DataGeneratorContainer visit(VarCharType varCharType) { + "." + DataGenConnectorOptionsUtil.LENGTH) .intType() -.defaultValue(RANDOM_STRING_LENGTH_DEFAULT); +.defaultValue(varCharType.getLength()); Review Comment: good idea! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
LadyForest commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1410315513 ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/DataGenTableSourceFactory.java: ## @@ -127,11 +132,53 @@ private DataGeneratorContainer createContainer( String name, DataType type, String kind, ReadableConfig options) { switch (kind) { case DataGenConnectorOptionsUtil.RANDOM: +validateFieldOptions(name, type, options); return type.getLogicalType().accept(new RandomGeneratorVisitor(name, options)); case DataGenConnectorOptionsUtil.SEQUENCE: return type.getLogicalType().accept(new SequenceGeneratorVisitor(name, options)); default: throw new ValidationException("Unsupported generator kind: " + kind); } } + +private void validateFieldOptions(String name, DataType type, ReadableConfig options) { +ConfigOption lenOption = +key(DataGenConnectorOptionsUtil.FIELDS ++ "." ++ name ++ "." ++ DataGenConnectorOptionsUtil.LENGTH) +.intType() +.noDefaultValue(); +options.getOptional(lenOption) +.ifPresent( +option -> { +LogicalType logicalType = type.getLogicalType(); +if (logicalType instanceof CharType +|| logicalType instanceof BinaryType) { +throw new ValidationException( +String.format( +"User-defined length of the fixed-length field %s is not supported.", Review Comment: Nit: Custom length for fixed-length type field '%s' is not supported. ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/DataGenTableSourceFactory.java: ## @@ -127,11 +132,53 @@ private DataGeneratorContainer createContainer( String name, DataType type, String kind, ReadableConfig options) { switch (kind) { case DataGenConnectorOptionsUtil.RANDOM: +validateFieldOptions(name, type, options); return type.getLogicalType().accept(new RandomGeneratorVisitor(name, options)); case DataGenConnectorOptionsUtil.SEQUENCE: return type.getLogicalType().accept(new SequenceGeneratorVisitor(name, options)); default: throw new ValidationException("Unsupported generator kind: " + kind); } } + +private void validateFieldOptions(String name, DataType type, ReadableConfig options) { +ConfigOption lenOption = +key(DataGenConnectorOptionsUtil.FIELDS ++ "." ++ name ++ "." ++ DataGenConnectorOptionsUtil.LENGTH) +.intType() +.noDefaultValue(); +options.getOptional(lenOption) +.ifPresent( +option -> { +LogicalType logicalType = type.getLogicalType(); +if (logicalType instanceof CharType +|| logicalType instanceof BinaryType) { +throw new ValidationException( +String.format( +"User-defined length of the fixed-length field %s is not supported.", +name)); +} +if (logicalType instanceof VarCharType) { +int length = ((VarCharType) logicalType).getLength(); +if (option > length) { +throw new ValidationException( +String.format( +"User-defined length of the VARCHAR field %s should be shorter than the schema definition.", +name)); +} +} +if (logicalType instanceof VarBinaryType) { +int length = ((VarBinaryType) logicalType).getLength(); +if (option > length) { +throw new ValidationException( +String.format( +
Re: [PR] [FLINK-32993][table] Datagen connector handles length-constrained fields according to the schema definition by default [flink]
liyubin117 commented on code in PR #23678: URL: https://github.com/apache/flink/pull/23678#discussion_r1404708247 ## flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/connector/datagen/table/RandomGeneratorVisitor.java: ## @@ -135,7 +135,7 @@ public DataGeneratorContainer visit(CharType charType) { + "." + DataGenConnectorOptionsUtil.LENGTH) .intType() -.defaultValue(RANDOM_STRING_LENGTH_DEFAULT); +.defaultValue(charType.getLength()); Review Comment: done :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org