[jira] [Commented] (FLUME-2070) Add a Flume Morphline Solr Sink

wolfgang hoschek (JIRA) Sat, 15 Jun 2013 00:58:24 -0700

    [ 
https://issues.apache.org/jira/browse/FLUME-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684105#comment-13684105
 ]


wolfgang hoschek commented on FLUME-2070:
-----------------------------------------

bq. This need not be synchronized, since only one thread - the sink runner will 
call process on a particular sink

thanks - this is now fixed

bq. This counter should be incremented every time indexer.process() is called 
(though not all sinks even try to update this counter)

I looked at HDFSEventSink and it behaves exactly like the MorphlineSink in this 
regard

bq. We should log the exceptions if the rollbacks (for the indexer or the 
channel) failed

thanks - this is now fixed
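
For illustration only, a minimal sketch of logging a failed rollback instead 
of swallowing it. The names (RollbackLoggingSketch, Rollbackable, 
rollbackQuietly) are hypothetical and not taken from the patch, and 
java.util.logging is used only to keep the example self-contained:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class RollbackLoggingSketch {
  private static final Logger LOG =
      Logger.getLogger(RollbackLoggingSketch.class.getName());

  /** Hypothetical stand-in for anything that can be rolled back
   *  (for example the indexer or the channel transaction). */
  interface Rollbackable {
    void rollback() throws Exception;
  }

  /**
   * Attempts the rollback and logs any failure with its stack trace.
   * Returns true if the rollback succeeded, false if it threw.
   */
  static boolean rollbackQuietly(String name, Rollbackable target) {
    try {
      target.rollback();
      return true;
    } catch (Exception e) {
      LOG.log(Level.SEVERE, "Rollback failed for " + name, e);
      return false;
    }
  }

  public static void main(String[] args) {
    // The first rollback fails and is logged; the second one still runs.
    boolean indexerOk = rollbackQuietly("indexer", () -> {
      throw new IllegalStateException("simulated failure");
    });
    boolean channelOk = rollbackQuietly("channel", () -> { });
    System.out.println(indexerOk + " " + channelOk);
  }
}
```

The point of the sketch is that a failure in one rollback is recorded and 
does not prevent the other rollback from being attempted.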

bq. Why not use a LinkedBlockingDeque or something like that here

thanks - this is now fixed

bq. Also we might want to make sure we only spawn a maximum number of the 
LocalMorphlineInterceptors?

The number of LocalMorphlineInterceptors can never exceed the number of 
concurrent threads, so this seems fine to me without an explicit limit
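
To show why the count stays bounded, here is a minimal sketch of a 
borrow/return pool backed by a LinkedBlockingDeque. It is not the code from 
the patch; WorkerPool, borrow, giveBack and createdCount are hypothetical 
names. A new instance is created only when no idle one is available, so at 
most one instance exists per caller that is holding one at the same time:

```java
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical pool that reuses returned workers before creating new ones. */
final class WorkerPool<T> {
  private final BlockingDeque<T> idle = new LinkedBlockingDeque<>();
  private final Supplier<T> factory;
  private final AtomicInteger created = new AtomicInteger();

  WorkerPool(Supplier<T> factory) {
    this.factory = factory;
  }

  /** Returns an idle worker if one exists, otherwise creates one. Never blocks. */
  T borrow() {
    T worker = idle.pollFirst();
    if (worker == null) {
      worker = factory.get();
      created.incrementAndGet();
    }
    return worker;
  }

  /** Hands a worker back so that the next caller can reuse it. */
  void giveBack(T worker) {
    idle.addFirst(worker);
  }

  /** Number of workers created so far. */
  int createdCount() {
    return created.get();
  }
}

final class WorkerPoolDemo {
  public static void main(String[] args) {
    WorkerPool<StringBuilder> pool = new WorkerPool<>(StringBuilder::new);
    StringBuilder first = pool.borrow();
    pool.giveBack(first);
    StringBuilder second = pool.borrow();
    // The returned worker is reused, so only one instance was ever created.
    System.out.println((first == second) + " " + pool.createdCount());
  }
}
```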

bq. This array can be pre-allocated and reused no?

I'm not sure it can be reused in a thread-safe way, or whether thread safety 
is even required here.

bq. This code seems to be repeated in BlobHandler and BlobDeserializer, maybe 
move it to a util class or something?

I'd rather keep it separate for the time being. For example, at some point you 
guys might decide to pull one of the two classes up into a higher package, and 
in that case it's easier if the code is not entangled.

bq. Also, could you please rebase on trunk?

yep, I've attached a new patch

bq. Also, we should see if we can fix the build issue. 

I don't know how to fix this one. I tried un-commenting the dependency in 
flume-ng-dist/pom.xml but for some reason it doesn't build like that:

[/cloud/repos/flume (trunk *+)] mvn clean test
[INFO] Scanning for projects...
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR]   The project org.apache.flume:flume-ng-dist:1.4.0-SNAPSHOT (/Users/hoschek/unix/cloud/repos/flume/flume-ng-dist/pom.xml) has 1 error
[ERROR]     'dependencies.dependency.version' for org.apache.flume.flume-ng-sinks:flume-ng-morphline-solr-sink:jar is missing. @ line 115, column 17
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException

When I comment out the dependency in flume-ng-dist/pom.xml it builds fine, but 
then the dist is incomplete, of course. I believe it might work if you build 
once with the dependency commented out, so that the Maven artifacts get 
produced, and then un-comment the dependency and rerun the build.

                
> Add a Flume Morphline Solr Sink
> -------------------------------
>
>                 Key: FLUME-2070
>                 URL: https://issues.apache.org/jira/browse/FLUME-2070
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources
>    Affects Versions: v1.3.1
>            Reporter: wolfgang hoschek
>             Fix For: v1.4.0
>
>         Attachments: FLUME-2070-v1.patch, FLUME-2070-v2.patch, 
> FLUME-2070-v3.patch
>
>
> Add a Flume Morphline Solr Sink that extracts search documents from Flume 
> events, transforms them with a morphline and loads them in Near Real Time 
> into Apache Solr, typically a SolrCloud. 
> The sink is intended to be used alongside the HdfsSink. It is designed to 
> extract, transform and load any data in flexible ways, not just structured 
> data, but also arbitrary raw data, including data from many heterogeneous 
> data sources.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
