imply-cheddar commented on a change in pull request #12250:
URL: https://github.com/apache/druid/pull/12250#discussion_r809660701
##########
File path:
processing/src/main/java/org/apache/druid/segment/serde/ColumnPartSerde.java
##########
@@ -39,13 +41,23 @@
@JsonSubTypes.Type(name = "floatV2", value =
FloatNumericColumnPartSerdeV2.class),
@JsonSubTypes.Type(name = "longV2", value =
LongNumericColumnPartSerdeV2.class),
@JsonSubTypes.Type(name = "doubleV2", value =
DoubleNumericColumnPartSerdeV2.class),
+ @JsonSubTypes.Type(name = "null", value = NullColumnPartSerde.class)
})
public interface ColumnPartSerde
{
@Nullable
Serializer getSerializer();
- Deserializer getDeserializer();
+ /**
+ * Returns a Deserializer to read a column from a segment.
+ *
+ * @param rowCountSupplier a supplier that returns the row count of
the segment that the column belongs to.
+ * Getting from the supplier the row count
can be expensive and thus should be
+ * evaluated lazily.
+ * @param segmentBitmapSerdeFactory bitmapSerdeFactory stored in the
segment. Each columnPartSerde may have their own
+ * bitmapSerdeFactory.
+ */
+ Deserializer getDeserializer(IntSupplier rowCountSupplier,
BitmapSerdeFactory segmentBitmapSerdeFactory);
Review comment:
I think I'd side on the thinking that this PR shouldn't be changing this
interface. It is perhaps sub-optimal that we are relying on Jackson to inject
in the factory which means that we likely are storing a bunch of extra
references (and maybe even allocating a bunch of extra objects) than we really
should. This interface change as done in this PR is trying to make life
"better" for the null column, but doesn't actually fix it for all of the other
columns. Even when we do that fix, I'd still advocate for us figuring out a
way to fix it through the constructors of the ColumnPartSerde objects instead
of by changing the interface.
As it stands, knowing the number of rows is definitely something that should
be done via the constructor. When this is persisted, it knows the number of
rows, it can store that as part of the constructor and doesn't need an extra
parameter on the interface method.
For the BitmapSerdeFactory, an argument could be made for things to go in
various different ways. I *think* that we might be getting to the point where
we actually want a Factory involved in here that can be given things like
metadata about the segment as a whole. But, that's just random musings. I'd
suggest that we keep to the current model for now and queue up a change to how
these factories are dealt with at a later stage.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]