Blocking is the antithesis of scaling and high performance.

    Mason Yu Jr.
    Principal Architect
    Big Data Architects, LLC.

The famous Sun Tzu

On Fri, May 15, 2015 at 9:54 AM, Fan Jiang <[email protected]> wrote:

> Yes, Enno is right about JDBC. Because JDBC is blocking by nature, and JDBC
> operations tend to be performed frequently when you are working with an RDBMS
> from Java, limiting them will potentially improve the topology's throughput.
>
> Fan
>
> 2015-05-15 9:32 GMT-04:00 Enno Shioji <[email protected]>:
>
>> JDBC drivers have no facility for making asynchronous requests, so the
>> calling thread has to wait until the IO call finishes before it can do
>> anything else. This can be wasteful if there is useful work that could have
>> been done in the meantime.
>>
>> Especially in the case of Storm, the thread that calls the tasks can be
>> shared by multiple tasks (depending on the configuration), in which case
>> there is *probably* useful work that could be done but can't be, because
>> the thread is "blocked".
>>
>> This is not specific to JDBC. Also, it's not obvious that you are better off
>> not blocking; e.g. if there is no other work the thread could be doing
>> anyway, you can end up decreasing overall performance because of the
>> additional overhead.
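>>
>> To make this concrete, here is a rough sketch of the offloading pattern
>> (the pool size, table, and field names are made up, and you should check
>> how your Storm version handles emitting from threads other than the
>> executor's):
>>
>> import java.sql.Connection;
>> import java.sql.DriverManager;
>> import java.sql.PreparedStatement;
>> import java.sql.ResultSet;
>> import java.util.Map;
>> import java.util.concurrent.ExecutorService;
>> import java.util.concurrent.Executors;
>>
>> import backtype.storm.task.OutputCollector;
>> import backtype.storm.task.TopologyContext;
>> import backtype.storm.topology.OutputFieldsDeclarer;
>> import backtype.storm.topology.base.BaseRichBolt;
>> import backtype.storm.tuple.Fields;
>> import backtype.storm.tuple.Tuple;
>> import backtype.storm.tuple.Values;
>>
>> public class AsyncJdbcLookupBolt extends BaseRichBolt {
>>     private static final String JDBC_URL = "jdbc:..."; // placeholder
>>
>>     private transient OutputCollector collector;
>>     private transient ExecutorService pool;
>>
>>     @Override
>>     public void prepare(Map conf, TopologyContext ctx, OutputCollector oc) {
>>         this.collector = oc;
>>         // Bounded pool: unbounded JDBC concurrency just moves the
>>         // contention into the database.
>>         this.pool = Executors.newFixedThreadPool(4);
>>     }
>>
>>     @Override
>>     public void execute(final Tuple input) {
>>         // Hand the blocking call to the pool; this thread returns at once.
>>         pool.submit(new Runnable() {
>>             @Override
>>             public void run() {
>>                 try (Connection c = DriverManager.getConnection(JDBC_URL);
>>                      PreparedStatement ps = c.prepareStatement(
>>                              "SELECT name FROM users WHERE id = ?")) {
>>                     ps.setLong(1, input.getLongByField("userId"));
>>                     try (ResultSet rs = ps.executeQuery()) {
>>                         // Serialize access in case the collector is not
>>                         // thread-safe in your Storm version.
>>                         synchronized (collector) {
>>                             if (rs.next()) {
>>                                 collector.emit(input,
>>                                         new Values(rs.getString(1)));
>>                             }
>>                             collector.ack(input);
>>                         }
>>                     }
>>                 } catch (Exception e) {
>>                     synchronized (collector) { collector.fail(input); }
>>                 }
>>             }
>>         });
>>     }
>>
>>     @Override
>>     public void cleanup() {
>>         pool.shutdown();
>>     }
>>
>>     @Override
>>     public void declareOutputFields(OutputFieldsDeclarer declarer) {
>>         declarer.declare(new Fields("name"));
>>     }
>> }
>>
>> In a real topology you'd use a proper connection pool (c3p0, BoneCP,
>> HikariCP, ...) instead of opening a connection per tuple.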
>>
>> On Fri, May 15, 2015 at 1:56 PM, Jeffery Maass <[email protected]> wrote:
>>
>>> Fan:
>>>
>>> Why are you singling out JDBC operations to avoid?  What is it about
>>> them that is especially "blocking"?
>>>
>>> Thank you for your time!
>>>
>>> +++++++++++++++++++++
>>> Jeff Maass <[email protected]>
>>> linkedin.com/in/jeffmaass
>>> stackoverflow.com/users/373418/maassql
>>> +++++++++++++++++++++
>>>
>>>
>>> On Thu, May 14, 2015 at 9:41 AM, Fan Jiang <[email protected]> wrote:
>>>
>>>> One thing to note is that you should try to avoid JDBC operations in a
>>>> bolt, as they may block the bolt and affect the topology's performance. Try
>>>> to do the database access asynchronously, or create a separate thread for
>>>> JDBC operations.
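>>>>
>>>> As a rough sketch of the "separate thread" option (the URL, table, and
>>>> queue size here are hypothetical):
>>>>
>>>> import java.sql.Connection;
>>>> import java.sql.DriverManager;
>>>> import java.sql.PreparedStatement;
>>>> import java.util.concurrent.BlockingQueue;
>>>> import java.util.concurrent.LinkedBlockingQueue;
>>>>
>>>> // One background thread owns the JDBC connection; the bolt's execute()
>>>> // only calls offer(), which never blocks on the database.
>>>> public class JdbcWriterThread implements Runnable {
>>>>     private final BlockingQueue<long[]> queue =
>>>>             new LinkedBlockingQueue<long[]>(10000);
>>>>     private final String jdbcUrl;
>>>>
>>>>     public JdbcWriterThread(String jdbcUrl) {
>>>>         this.jdbcUrl = jdbcUrl;
>>>>     }
>>>>
>>>>     // Called from the bolt; false means the queue is full and the
>>>>     // caller should apply backpressure (e.g. fail the tuple).
>>>>     public boolean offer(long id, long score) {
>>>>         return queue.offer(new long[] { id, score });
>>>>     }
>>>>
>>>>     @Override
>>>>     public void run() {
>>>>         try (Connection c = DriverManager.getConnection(jdbcUrl);
>>>>              PreparedStatement ps = c.prepareStatement(
>>>>                      "UPDATE scores SET score = ? WHERE id = ?")) {
>>>>             while (!Thread.currentThread().isInterrupted()) {
>>>>                 long[] row = queue.take(); // blocks here, not in the bolt
>>>>                 ps.setLong(1, row[1]);
>>>>                 ps.setLong(2, row[0]);
>>>>                 ps.executeUpdate();
>>>>             }
>>>>         } catch (Exception e) {
>>>>             // real code: log, reconnect, and decide what to do with
>>>>             // the rows still sitting in the queue
>>>>         }
>>>>     }
>>>> }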
>>>>
>>>> 2015-05-14 10:30 GMT-04:00 Mason Yu <[email protected]>:
>>>>
>>>>> Interesting.....  Hibernate hooks into a J2EE container or Spring,
>>>>> which requires a specific O/R mapping to a 20th-century RDBMS.
>>>>> Storm works in a distributed Linux environment, which does not
>>>>> need an RDBMS.  RDBMSs do not work in a distributed environment.
>>>>>
>>>>> Mason Yu Jr.
>>>>> CEO
>>>>> Big Data Architects, LLC.
>>>>>
>>>>> The famous Sun Tzu
>>>>>
>>>>> On Thu, May 14, 2015 at 9:58 AM, Stephen Powis <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hello everyone!
>>>>>>
>>>>>> I'm currently toying around with a prototype built on top of Storm and
>>>>>> have been running into some rough going while trying to work with
>>>>>> Hibernate and Storm.  I was hoping to get input on whether this is just
>>>>>> a case of "I'm doing it wrong", or maybe to get some useful tips.
>>>>>>
>>>>>> In my prototype, I have a need to fan out a single tuple to several
>>>>>> bolts which do data retrieval from our database in parallel, and whose
>>>>>> outputs then get merged back into a single stream.  These data retrieval
>>>>>> bolts all find various Hibernate entities and pass them along to the
>>>>>> merge bolt.  We've written a Kryo serializer (registration sketched
>>>>>> below) that converts the Hibernate entities into POJOs, which get sent
>>>>>> to the merge bolt in tuples.  Once all the tuples reach the merge bolt,
>>>>>> it collects them into a single tuple and passes it downstream to a bolt
>>>>>> which does processing using the entities.
>>>>>>
>>>>>> So it looks something like this.
>>>>>>
>>>>>>                       ---- (retrieve bolt a) ----
>>>>>>                     / ---- (retrieve bolt b) ----\
>>>>>>                    /----- (retrieve bolt c) -----\
>>>>>> --- (split bolt) ----- (retrieve bolt d) ----- (merge bolt) --- (processing bolt)
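>>>>>>
>>>>>> For what it's worth, the serializer is hooked in through the standard
>>>>>> Storm config route; roughly like this (class names simplified/made up):
>>>>>>
>>>>>> import backtype.storm.Config;
>>>>>>
>>>>>> Config conf = new Config();
>>>>>> // UserEntitySerializer extends com.esotericsoftware.kryo.Serializer
>>>>>> // and reads/writes the detached state as a plain POJO.
>>>>>> conf.registerSerialization(UserEntity.class, UserEntitySerializer.class);
>>>>>> // Fail fast instead of silently falling back to Java serialization:
>>>>>> conf.setFallBackOnJavaSerialization(false);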
>>>>>>
>>>>>> Detaching the Hibernate entities from the session in order to serialize
>>>>>> them, and then reattaching them to a new session further downstream when
>>>>>> we want to work with them again, seems kind of awkward.
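>>>>>>
>>>>>> Concretely, downstream we end up doing something like this (a sketch;
>>>>>> the entity and factory names are made up):
>>>>>>
>>>>>> import org.hibernate.Session;
>>>>>> import org.hibernate.SessionFactory;
>>>>>>
>>>>>> public UserEntity reattach(SessionFactory factory, UserEntity detached) {
>>>>>>     Session session = factory.openSession();
>>>>>>     try {
>>>>>>         session.beginTransaction();
>>>>>>         // merge() copies the detached POJO's state onto an instance
>>>>>>         // managed by this new session and returns that managed copy.
>>>>>>         UserEntity managed = (UserEntity) session.merge(detached);
>>>>>>         // ... work with 'managed' while the session is open ...
>>>>>>         session.getTransaction().commit();
>>>>>>         return managed;
>>>>>>     } finally {
>>>>>>         session.close();
>>>>>>     }
>>>>>> }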
>>>>>>
>>>>>> Does doing the above make sense?  Has anyone attempted it?  Any tips or
>>>>>> things we should watch out for?  Basically looking for any kind of input
>>>>>> on this use case.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Sincerely,
>>>> Fan Jiang
>>>>
>>>> IT Developer at RENCI
>>>> [email protected]
>>>>
>>>
>>>
>>
>
>
> --
> Sincerely,
> Fan Jiang
>
> IT Developer at RENCI
> [email protected]
>
