[
https://issues.apache.org/jira/browse/FLUME-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684105#comment-13684105
]
wolfgang hoschek commented on FLUME-2070:
-----------------------------------------
bq. This need not be synchronized, since only one thread - the sink runner will
call process on a particular sink
thanks - this is now fixed
bq. This counter should be incremented every time indexer.process() is called
(though not all sinks even try to update this counter)
I looked at HDFSEventSink and it behaves exactly like the MorphlineSink in this
regard
bq. We should log the exceptions if the rollbacks (for the indexer or the
channel) failed
thanks - this is now fixed
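For readers following along, here is a minimal sketch of the idea of logging rollback failures rather than swallowing them. It is not the code from the patch: RollbackExample, Rollbackable and rollbackAll are hypothetical names, and java.util.logging is used only to keep the sketch self-contained (an assumption; the real sink would use whatever logger the project already uses).

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper, not taken from the FLUME-2070 patch. */
final class RollbackExample {
    private static final Logger LOG =
        Logger.getLogger(RollbackExample.class.getName());

    /** Hypothetical stand-in for anything that can be rolled back. */
    interface Rollbackable {
        void rollback() throws Exception;
    }

    /**
     * Attempts to roll back every resource. A failing rollback is logged
     * and does not prevent the remaining rollbacks from running.
     * Returns the number of rollbacks that failed.
     */
    static int rollbackAll(Throwable cause, Rollbackable... resources) {
        int failures = 0;
        for (Rollbackable resource : resources) {
            try {
                resource.rollback();
            } catch (Exception e) {
                failures++;
                LOG.log(Level.SEVERE,
                    "Rollback failed while handling: " + cause, e);
            }
        }
        return failures;
    }
}
```

The point of the loop is that a failed indexer rollback still lets the channel rollback run, and both failures end up in the log.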
bq. Why not use a LinkedBlockingDeque or something like that here
thanks - this is now fixed
bq. Also we might want to make sure we only spawn a maximum number of the
LocalMorphlineInterceptors?
The number of LocalMorphlineInterceptors never exceeds the number of
concurrent threads, so this seems fine to me without an explicit limit.
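To illustrate the reasoning, here is a rough sketch of a pool backed by a LinkedBlockingDeque. It is not the code from the patch: InstancePool, borrow, giveBack and createdCount are hypothetical names. A new instance is created only when no idle one is available, so the number of instances ever created should stay at or below the peak number of threads borrowing at the same time.

```java
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical pool, not taken from the FLUME-2070 patch. */
final class InstancePool<T> {
    private final BlockingDeque<T> idle = new LinkedBlockingDeque<>();
    private final Supplier<T> factory;
    private final AtomicInteger created = new AtomicInteger();

    InstancePool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Reuses an idle instance if one exists, otherwise creates a new one. */
    T borrow() {
        T instance = idle.pollFirst(); // non-blocking; null when nothing is idle
        if (instance == null) {
            created.incrementAndGet();
            instance = factory.get();
        }
        return instance;
    }

    /** Hands the instance back so that a later caller can reuse it. */
    void giveBack(T instance) {
        idle.addFirst(instance);
    }

    /** Number of instances created so far. */
    int createdCount() {
        return created.get();
    }
}
```

With two callers borrowing at once and then returning their instances, a third borrow reuses an existing instance and the created count stays at two.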
bq. This array can be pre-allocated and reused no?
I'm not sure it can be reused in a thread-safe way, or whether thread safety
is even required here.
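If thread safety does turn out to matter, one option would be a per-thread buffer, sketched below. This is only an illustration under assumptions: ScratchBuffers and the 8192-byte size are hypothetical and do not come from the patch.

```java
/** Hypothetical helper, not taken from the FLUME-2070 patch. */
final class ScratchBuffers {
    // Assumed buffer size, chosen only for illustration.
    private static final int SIZE = 8192;

    // Each thread lazily gets its own array, so no locking is needed.
    private static final ThreadLocal<byte[]> BUFFER =
        ThreadLocal.withInitial(() -> new byte[SIZE]);

    private ScratchBuffers() {
    }

    /** Returns the calling thread's reusable buffer. */
    static byte[] get() {
        return BUFFER.get();
    }
}
```

The trade-off is one array per thread instead of one per call; callers must not hold on to the array after handing control to code that might reuse it on the same thread.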
bq. This code seems to be repeated in BlobHandler and BlobDeserializer, maybe
move it to a util class or something?
I'd rather keep it separate for the time being. For example, at some point you
guys might decide to pull one of the two classes up into a higher package, and
in that case it's easier if the code isn't entangled.
bq. Also, could you please rebase on trunk?
yep, I've attached a new patch
bq. Also, we should see if we can fix the build issue.
I don't know how to fix this one. I tried un-commenting the dependency in
flume-ng-dist/pom.xml but for some reason it doesn't build like that:
[/cloud/repos/flume (trunk *+)] mvn clean test
[INFO] Scanning for projects...
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR] The project org.apache.flume:flume-ng-dist:1.4.0-SNAPSHOT
(/Users/hoschek/unix/cloud/repos/flume/flume-ng-dist/pom.xml) has 1 error
[ERROR] 'dependencies.dependency.version' for
org.apache.flume.flume-ng-sinks:flume-ng-morphline-solr-sink:jar is missing. @
line 115, column 17
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please
read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
When I comment out the dependency in flume-ng-dist/pom.xml it builds fine, but
then the dist is incomplete, of course. I believe it might work to build once
with the dependency commented out, so that the Maven artifacts get produced,
and then un-comment the dependency and rerun the build.
> Add a Flume Morphline Solr Sink
> -------------------------------
>
> Key: FLUME-2070
> URL: https://issues.apache.org/jira/browse/FLUME-2070
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources
> Affects Versions: v1.3.1
> Reporter: wolfgang hoschek
> Fix For: v1.4.0
>
> Attachments: FLUME-2070-v1.patch, FLUME-2070-v2.patch,
> FLUME-2070-v3.patch
>
>
> Add a Flume Morphline Solr Sink that extracts search documents from Flume
> events, transforms them with a morphline and loads them in Near Real Time
> into Apache Solr, typically a SolrCloud.
> The sink is intended to be used alongside the HdfsSink. It is designed to
> extract, transform and load any data in flexible ways, not just structured
> data, but also arbitrary raw data, including data from many heterogeneous
> data sources.