Oh, that's a valid question. I forgot to mention that once HDFS is brought up
to date (i.e. the spout coordinator's time position has reached the current
time), the Storm job will no longer emit 1-hour slices; instead it will
emit in real time any new data that has just been inserted into Oracle. It
will still try to batch that data in 5-second slices, so at that point it
becomes real-time streaming to HDFS. I'm not sure whether that's something
that can be done via Sqoop. At some point we'll be streaming to a web service
endpoint as well, which I think is not doable via Sqoop.
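
To make the switch between 1-hour slices and 5-second slices concrete, here is a
rough sketch of how the coordinator could pick the next slice. It is not the
actual spout code: SlicePlanner and nextSlice are hypothetical names, and the
rule "use a 1-hour slice only while a full hour of backlog remains" is an
assumption made for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

// Hypothetical helper, not part of Storm: picks the next time slice to emit.
final class SlicePlanner {
    static final Duration CATCH_UP_SLICE = Duration.ofHours(1);
    static final Duration LIVE_SLICE = Duration.ofSeconds(5);

    /** Returns {start, end} of the next slice, or empty if too little time has elapsed. */
    static Optional<Instant[]> nextSlice(Instant position, Instant now) {
        Duration lag = Duration.between(position, now);
        // Assumption: stay in catch-up mode only while a full hour of backlog remains.
        Duration length = lag.compareTo(CATCH_UP_SLICE) >= 0 ? CATCH_UP_SLICE : LIVE_SLICE;
        if (lag.compareTo(length) < 0) {
            return Optional.empty(); // fewer than 5 seconds of new data, so wait
        }
        return Optional.of(new Instant[] {position, position.plus(length)});
    }

    public static void main(String[] args) {
        Instant position = Instant.parse("2014-11-03T12:00:00Z");
        Instant now = Instant.parse("2014-11-03T17:30:00Z");
        int hourly = 0;
        Optional<Instant[]> slice;
        while ((slice = nextSlice(position, now)).isPresent()) {
            Instant[] s = slice.get();
            if (Duration.between(s[0], s[1]).equals(CATCH_UP_SLICE)) {
                hourly++;
            }
            position = s[1]; // advance the coordinator position past the emitted slice
        }
        System.out.println("hourly slices: " + hourly);
    }
}
```

With the coordinator at 12pm and the clock at 5:30pm, this yields the same 5
hourly batches as in my original example. Under the assumption above, the
remaining half hour is then drained in 5-second slices.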

Thanks



On Mon, Nov 3, 2014 at 1:15 PM, Babu, Prashanth <[email protected]>
wrote:

>  Any specific reason for using Storm for this use case and not Sqoop?
>
>
>
> *From:* Aiman Najjar [mailto:[email protected]]
> *Sent:* 03 November 2014 15:25
> *To:* [email protected]
> *Subject:* Storm Trident use case
>
>
>
> Hello,
>
>
>
> I'm new to Storm, and I'm trying to build a Trident topology that exports
> rows from an Oracle DB to HDFS. I'm using an existing implementation of
> HdfsState.
>
>
>
> I wrote my own trident spout that emits tuples in 1-hour time slices (so
> the coordinator metadata are timestamps), so for instance, if the time now
> is 5:30pm, and the current position of the spout coordinator is 12pm, the
> spout will emit 5 batches of tuples.
>
>
>
> My problem is that I'm not sure whether I'm taking advantage of Storm's
> distributed processing. Will those 5 batches be processed in parallel? Is
> my design optimal? Here's what my topology looks like:
>
>
>
> topology.newStream(tableName + "_AUDIT_STREAM", auditSpout)
>         .partitionPersist(factory, tableFields, new HdfsUpdater(), new Fields());
>
>
>
>
>
> Thanks
>
