jihoonson commented on a change in pull request #9171: Doc update for the new input source and the new input format
URL: https://github.com/apache/druid/pull/9171#discussion_r367707404
 
 

 ##########
 File path: docs/development/modules.md
 ##########
 @@ -148,29 +150,43 @@ To start a segment killing task, you need to access the old Coordinator console
 
 After the killing task ends, `index.zip` (`partitionNum_index.zip` for HDFS data storage) file should be deleted from the data storage.
 
-### Adding a new Firehose
+### Adding support for a new input source
 
-There is an example of this in the `s3-extensions` module with the StaticS3FirehoseFactory.
+Adding support for a new input source requires implementing three interfaces: `InputSource`, `InputEntity`, and `InputSourceReader`.
+`InputSource` defines where the input data is stored. `InputEntity` defines how data can be read in parallel
+in [native parallel indexing](../ingestion/native-batch.md).
+`InputSourceReader` defines how to read your new input source; in most cases you can simply use the provided `InputEntityIteratingReader`.
 
-Adding a Firehose is done almost entirely through the Jackson Modules instead of Guice.  Specifically, note the implementation
+There is an example of this in the `druid-s3-extensions` module with the `S3InputSource` and `S3Entity`.
+
+Adding an InputSource is done almost entirely through the Jackson Modules instead of Guice. Specifically, note the implementation
 
 ``` java
 @Override
 public List<? extends Module> getJacksonModules()
 {
   return ImmutableList.of(
-          new SimpleModule().registerSubtypes(new NamedType(StaticS3FirehoseFactory.class, "static-s3"))
+          new SimpleModule().registerSubtypes(new NamedType(S3InputSource.class, "s3"))
   );
 }
 ```
 
-This is registering the FirehoseFactory with Jackson's polymorphic serialization/deserialization layer.  More concretely, having this will mean that if you specify a `"firehose": { "type": "static-s3", ... }` in your realtime config, then the system will load this FirehoseFactory for your firehose.
+This registers the InputSource with Jackson's polymorphic serialization/deserialization layer.  More concretely, if you specify an `"inputSource": { "type": "s3", ... }` in your IO config, then the system will load this class as your `InputSource` implementation.
+
+Note that inside of Druid, we have made the `@JacksonInject` annotation for Jackson deserialized objects actually use the base Guice injector to resolve the object to be injected.  So, if your InputSource needs access to some object, you can add a `@JacksonInject` annotation on a setter and it will get set on instantiation.
 
 Review comment:
   Added.
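
   For readers following this thread, the division of labor among the three roles described in the diff can be sketched roughly as below. This is only an illustration under loud assumptions: `SketchSource`, `SketchEntity`, `SketchReader`, and every class in the block are hypothetical stand-ins written against the JDK alone. They are not Druid's `InputSource`, `InputEntity`, or `InputSourceReader`, whose real signatures differ, and `IteratingReader` only loosely mirrors the role the doc assigns to `InputEntityIteratingReader`.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InputSourceSketch {
    /** Hypothetical stand-in: one independently readable unit of input. */
    interface SketchEntity {
        InputStream open() throws IOException;
    }

    /** Hypothetical stand-in: turns the input into rows (here, plain text lines). */
    interface SketchReader {
        List<String> read();
    }

    /** Hypothetical stand-in: knows where the data lives and builds a reader over it. */
    interface SketchSource {
        SketchReader reader();
    }

    /** An entity backed by an in-memory string instead of a remote object. */
    static final class StringEntity implements SketchEntity {
        private final String content;

        StringEntity(String content) {
            this.content = content;
        }

        @Override
        public InputStream open() {
            return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** A generic reader that walks the entities in order and collects their lines. */
    static final class IteratingReader implements SketchReader {
        private final List<? extends SketchEntity> entities;

        IteratingReader(List<? extends SketchEntity> entities) {
            this.entities = entities;
        }

        @Override
        public List<String> read() {
            List<String> rows = new ArrayList<>();
            for (SketchEntity entity : entities) {
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(entity.open(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        rows.add(line);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
            return rows;
        }
    }

    /** A source that maps each in-memory blob to one entity and reuses the generic reader. */
    static final class InMemorySource implements SketchSource {
        private final List<String> blobs;

        InMemorySource(List<String> blobs) {
            this.blobs = blobs;
        }

        @Override
        public SketchReader reader() {
            List<SketchEntity> entities = new ArrayList<>();
            for (String blob : blobs) {
                entities.add(new StringEntity(blob));
            }
            return new IteratingReader(entities);
        }
    }

    /** Convenience entry point: read every line from every blob. */
    static List<String> readAll(List<String> blobs) {
        return new InMemorySource(blobs).reader().read();
    }

    public static void main(String[] args) {
        System.out.println(readAll(Arrays.asList("a\nb", "c"))); // prints [a, b, c]
    }
}
```

   The point of the split is that a new source only has to say which entities exist and how each one is opened; the iteration logic can be shared, which is presumably why the doc suggests reusing the provided reader in most cases.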
