changquanyou edited a comment on issue #2065:
URL: https://github.com/apache/iceberg/issues/2065#issuecomment-758373894
@kbendick Thanks for your reply. I will explain in detail:
- Table Schema:
`CREATE TABLE mf_data (
urlHash STRING,
titleSimHash STRING,
docId STRING,
title STRING,
content STRING,
pubTime LONG,
pubDayStr STRING
) using iceberg PARTITIONED BY (pubDayStr);`
The service guarantees that the **partition key** is never empty. The title
may be empty.
- Data Source:
Data is consumed from a Kafka topic and written to the Iceberg table. At first, the data
looked normal.
- Experiments:
1. Run a LIKE query on the **title** field:
`select title from mf_data where pubDayStr = '2021-01-09' and title
like '喜讯%' limit 1;`
This throws **java.lang.IllegalArgumentException**: Truncate length should be
positive.
2. Run a LIKE query on the **content** field:
`select title from mf_data where pubDayStr = '2021-01-09' and content
like '喜讯%' limit 1;`
The result is normal; no exception occurred.
3. Insert rows with the Spark SQL client:
`insert into mf_data
values('7a1ab0138b535572ff346801c8b61ec0','778ec08c0e9930a64fa3ea03de1334aa','bfd_c68722c6-2224-41f8-82bd-675f0b81f0cd','房东整租霍营小区二层两居室','房东整租霍营小区二层两居室',1607912994000,'2020-12-14');`
`select title from mf_data where pubDayStr = '2020-12-14' and content like
'房东%' limit 1;`
The result above is normal. I then inserted another row (**the title is
empty**):
`insert into mf_data
values('testb0138b535572ff346801c8b61ec0','test08c0e9930a64fa3ea03de1334aa','bfd_c68722c6-2224-41f8-82bd-675f0b81test1','','Test房东整租霍营小区二层两居室',1607912994001,'2020-12-14');`
Running the same Spark SQL query now throws **Truncate length should be
positive**.
- TODO:
Because the table is huge, it is not easy to confirm whether rows with an empty title exist,
so I will add a title-length flag to the table.
Based on the experiment results, the IllegalArgumentException does occur when the
field's length is zero. I hope this helps.
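
The experiments above are consistent with a prefix filter that truncates a data file's column bounds to the prefix length before comparing them. The following is only a minimal sketch of that idea in Python, not Iceberg's actual code: `truncate_binary` and `may_contain_prefix` are hypothetical names, and the assumption that an empty string ends up as a file's lower bound is mine.

```python
def truncate_binary(value: bytes, length: int) -> bytes:
    """Hypothetical truncation helper that rejects non-positive lengths."""
    if length <= 0:
        raise ValueError("Truncate length should be positive")
    return value[:length]


def may_contain_prefix(prefix: str, lower: str, upper: str) -> bool:
    """Decide from a file's min/max bounds for a column whether the file
    may hold rows matching LIKE 'prefix%' (assumed pruning logic)."""
    p = prefix.encode("utf-8")
    lo = lower.encode("utf-8")
    hi = upper.encode("utf-8")

    # Compare only as many bytes as both the prefix and the bound have.
    n = min(len(p), len(lo))  # becomes 0 when the lower bound is ""
    if truncate_binary(lo, n) > p[:n]:
        return False  # every value in the file sorts after the prefix

    n = min(len(p), len(hi))
    if truncate_binary(hi, n) < p[:n]:
        return False  # every value in the file sorts before the prefix

    return True
```

Under these assumptions, a file whose lower bound for the filtered column is the empty string yields a truncate length of 0, so `may_contain_prefix("房东", "", "房东整租霍营小区二层两居室")` raises `ValueError` with the same message as in the experiments, while non-empty bounds are compared normally.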