wangxiaobaidu11 edited a comment on pull request #10920:
URL: https://github.com/apache/druid/pull/10920#issuecomment-990912600
> @wangxiaobaidu11 you don't need to make changes to the druid spark code
for your use case - you can call
`AggregatorFactoryRegistry.register("longUnique", new
LongUniqueAggregatorFactory("", "", 0))` from within your own spark app. That's
definitely still ugly since the AggregatorFactory instance is unnecessary, but
as mentioned in my previous comment this won't be the case for long. If
instantiating an instance is a problem, there is one other temporary
work-around: because all `AggregatorFactoryRegistry` does under the hood is
register subtypes, you can use the public package method `registerSubType`. In
your case, you would call `org.apache.druid.spark.registerSubtype(new
NamedType(classOf[LongUniqueAggregatorFactory], "longUnique"))` from your spark
app. (You can statically import that method if you'd like, leaving just
`registerSubtype(...)`.)
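Putting the two work-arounds from the comment above into one place, a minimal sketch might look like the following. This assumes `LongUniqueAggregatorFactory` is your own custom factory class and that the constructor arguments shown in the comment are ignored during registration; the package paths are taken verbatim from the comment and are not verified here.

```scala
// Run once in the Spark application, before reading or writing Druid data.
import com.fasterxml.jackson.databind.jsontype.NamedType
import org.apache.druid.spark.AggregatorFactoryRegistry

// Option 1: register via the registry. The factory instance's arguments are
// unused; only the type name matters for deserialization.
AggregatorFactoryRegistry.register("longUnique",
  new LongUniqueAggregatorFactory("", "", 0))

// Option 2: register the Jackson subtype directly, skipping the throwaway
// factory instance.
org.apache.druid.spark.registerSubtype(
  new NamedType(classOf[LongUniqueAggregatorFactory], "longUnique"))
```

Either call has the same effect under the hood: it teaches the connector's Jackson mapper how to map the `"longUnique"` type name onto your factory class.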
Thanks! I will update it. I have another question.
① When I set:

② Spark runtime info:
`21/12/10 16:09:45 INFO DruidDataSourceWriter: Committing the following
segments: DataSegment{binaryVersion=9,
id=test_spark_druid_cube_v4_2020-01-01T00:00:00.000Z_2020-01-02T00:00:00.000Z_2021-12-10T08:09:41.601Z,
loadSpec={type=>hdfs,
path=>hdfs://xxxx/xxxx/xxx/xxxxxx/segments/test_spark_druid_cube_v4/20200101T000000.000Z_20200102T000000.000Z/2021-12-10T08_09_41.601Z/3_5427a1c2-6405-4516-83b1-2dd17bfff433_index.zip},
dimensions=[dim1, dim2, id1, id2], metrics=[count, sum_metric1, sum_metric2,
sum_metric3, sum_metric4, uniq_id1_unique],
shardSpec=NumberedShardSpec{partitionNum=0, partitions=1},
lastCompactionState=null, size=3390}, DataSegment{binaryVersion=9,
id=test_spark_druid_cube_v4_2020-01-01T00:00:00.000Z_2020-01-02T00:00:00.000Z_2021-12-10T08:09:41.443Z,
loadSpec={type=>hdfs,
path=>hdfs://xxxx/xxxx/xxx/xxxxxx/segments/test_spark_druid_cube_v4/20200101T000000.000Z_20200102T000000.000Z/2021-12-10T08_09_41.443Z/1_c5be6b7e-76a6-44dd-9c53-f189950cb54d_index.zip},
dimensions=[dim1, dim2, id1, id2], metrics=[count, sum_metric1, sum_metric2, sum_metric3,
sum_metric4, uniq_id1_unique], shardSpec=NumberedShardSpec{partitionNum=0,
partitions=1}, lastCompactionState=null, size=3390},
DataSegment{binaryVersion=9,
id=test_spark_druid_cube_v4_2020-01-01T00:00:00.000Z_2020-01-02T00:00:00.000Z_2021-12-10T08:09:41.767Z,
loadSpec={type=>hdfs,
path=>hdfs://xxxx/xxxx/xxx/xxxxxx/segments/test_spark_druid_cube_v4/20200101T000000.000Z_20200102T000000.000Z/2021-12-10T08_09_41.767Z/2_04fc7ed4-4131-4856-b60d-95c7b409251c_index.zip},
dimensions=[dim1, dim2, id1, id2], metrics=[count, sum_metric1, sum_metric2,
sum_metric3, sum_metric4, uniq_id1_unique],
shardSpec=NumberedShardSpec{partitionNum=0, partitions=1},
lastCompactionState=null, size=3390}, DataSegment{binaryVersion=9,
id=test_spark_druid_cube_v4_2020-01-01T00:00:00.000Z_2020-01-02T00:00:00.000Z_2021-12-10T08:09:41.336Z,
loadSpec={type=>hdfs,
path=>hdfs://xxxx/xxxx/xxx/xxxxxx/segments/test_spark_druid_cube_v4/20200101T000000.000Z_20200102T000000.000Z/2021-12-10T08_09_41.336Z/0_918c926a-5738-4a19-a58b-8a3024ee01ad_index.zip},
dimensions=[dim1, dim2, id1, id2], metrics=[count, sum_metric1, sum_metric2,
sum_metric3, sum_metric4, uniq_id1_unique],
shardSpec=NumberedShardSpec{partitionNum=0, partitions=1},
lastCompactionState=null, size=3390}, DataSegment{binaryVersion=9,
id=test_spark_druid_cube_v4_2020-01-02T00:00:00.000Z_2020-01-03T00:00:00.000Z_2021-12-10T08:09:41.299Z,
loadSpec={type=>hdfs,
path=>hdfs://xxxx/xxxx/xxx/xxxxxx/segments/test_spark_druid_cube_v4/20200102T000000.000Z_20200103T000000.000Z/2021-12-10T08_09_41.299Z/5_c62528cf-4377-4fa4-98da-df09dcc8e359_index.zip},
dimensions=[dim1, dim2, id1, id2], metrics=[count, sum_metric1, sum_metric2,
sum_metric3, sum_metric4, uniq_id1_unique],
shardSpec=NumberedShardSpec{partitionNum=0, partitions=1},
lastCompactionState=null, size=3390}, DataSegment{binaryVersion=9,
id=test_spark_druid_cube_v4_2020-01-02T00:00:00.000Z_2020-01-03T00:00:00.000Z_2021-12-10T08:09:41.835Z,
loadSpec={type=>hdfs,
path=>hdfs://xxxx/xxxx/xxx/xxxxxx/segments/test_spark_druid_cube_v4/20200102T000000.000Z_20200103T000000.000Z/2021-12-10T08_09_41.835Z/4_176e1242-cb5f-4745-bae3-0e9bb47b6c62_index.zip},
dimensions=[dim1, dim2, id1, id2], metrics=[count, sum_metric1, sum_metric2,
sum_metric3, sum_metric4, uniq_id1_unique],
shardSpec=NumberedShardSpec{partitionNum=0, partitions=1},
lastCompactionState=null, size=3466}
21/12/10 16:09:45 INFO SQLMetadataStorageUpdaterJobHandler: Published
test_spark_druid_cube_v4_2020-01-01T00:00:00.000Z_2020-01-02T00:00:00.000Z_2021-12-10T08:09:41.601Z
21/12/10 16:09:45 INFO SQLMetadataStorageUpdaterJobHandler: Published
test_spark_druid_cube_v4_2020-01-01T00:00:00.000Z_2020-01-02T00:00:00.000Z_2021-12-10T08:09:41.443Z
21/12/10 16:09:45 INFO SQLMetadataStorageUpdaterJobHandler: Published
test_spark_druid_cube_v4_2020-01-01T00:00:00.000Z_2020-01-02T00:00:00.000Z_2021-12-10T08:09:41.767Z
21/12/10 16:09:45 INFO SQLMetadataStorageUpdaterJobHandler: Published
test_spark_druid_cube_v4_2020-01-01T00:00:00.000Z_2020-01-02T00:00:00.000Z_2021-12-10T08:09:41.336Z
21/12/10 16:09:45 INFO SQLMetadataStorageUpdaterJobHandler: Published
test_spark_druid_cube_v4_2020-01-02T00:00:00.000Z_2020-01-03T00:00:00.000Z_2021-12-10T08:09:41.299Z
21/12/10 16:09:45 INFO SQLMetadataStorageUpdaterJobHandler: Published
test_spark_druid_cube_v4_2020-01-02T00:00:00.000Z_2020-01-03T00:00:00.000Z_2021-12-10T08:09:41.835Z
21/12/10 16:09:45 INFO WriteToDataSourceV2Exec: Data source writer
org.apache.druid.spark.v2.writer.DruidDataSourceWriter@1d1c63af committed.`
③ Segments for the same date are overshadowed, but I didn't want that to happen:
`21/12/10 16:09:45 WARN SegmentRationalizer: More than one version detected
for interval 2020-01-01T00:00:00.000Z/2020-01-02T00:00:00.000Z on dataSource
test_spark_druid_cube_v4! Some segments will be overshadowed!
21/12/10 16:09:45 WARN SegmentRationalizer: More than one version detected
for interval 2020-01-02T00:00:00.000Z/2020-01-03T00:00:00.000Z on dataSource
test_spark_druid_cube_v4! Some segments will be overshadowed!`

④ I expect the result to be combined segments. How do I set the partitioning?
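For context on why the warnings in ③ appear: each of the six segments above carries its own version timestamp (e.g. `...T08:09:41.601Z` vs `...T08:09:41.443Z` for the same 2020-01-01 interval), and Druid only keeps the newest version per interval visible, overshadowing the rest. One way this can happen is when rows for the same day are spread across multiple Spark partitions, so each writer task publishes its own version. A hedged sketch of one possible mitigation, assuming the DataFrame has a `__time` timestamp column and that the connector versions segments per writer task (column name and behavior are assumptions to verify against the connector docs):

```scala
import org.apache.spark.sql.functions.{col, date_trunc}

// Illustrative only: group all rows for one Druid interval (here, one day)
// into the same Spark partition before writing, so each interval is written
// by a single task instead of several competing versions.
val byDay = df.repartition(date_trunc("day", col("__time")))
```

Whether this alone yields combined segments depends on how the connector assigns versions and shard specs, so it is worth confirming against the writer's partitioning options rather than relying on this sketch.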
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]