Everything you said was correct; the server now accepts my large UDF. Thank
you!

Best wishes,
Torsten Bergh Moss
________________________________________
From: Murtadha Hubail <[email protected]>
Sent: Sunday, November 17, 2019 4:29 PM
To: Torsten Bergh Moss; [email protected]
Subject: Re: Large UDFs

Yes, and I believe it should go under the [common] config section. You will 
need to restart the asterixdb instance after that for the change to take 
effect. This property is configured in bytes. For example, if you want to set 
it to 100MB, it would be something like this:

[common]
max.web.request.size=104857600
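
The value above is 100 MB expressed in bytes, taking 1 MB as 1024 * 1024 bytes. A minimal Python sketch of that conversion follows; the helper names `mb_to_bytes` and `config_line` are made up for illustration and are not part of AsterixDB.

```python
def mb_to_bytes(megabytes: int) -> int:
    """Convert megabytes to bytes, assuming 1 MB = 1024 * 1024 bytes."""
    return megabytes * 1024 * 1024


def config_line(megabytes: int) -> str:
    """Build the max.web.request.size line for the [common] section."""
    return f"max.web.request.size={mb_to_bytes(megabytes)}"


if __name__ == "__main__":
    print(config_line(100))  # max.web.request.size=104857600
```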

Cheers,
Murtadha

On 11/17/2019, 6:17 PM, "Torsten Bergh Moss" <[email protected]> wrote:

    Thanks Murtadha,

    Do I configure this property under [cc] inside cc.conf?

    Best wishes,
    Torsten
    ________________________________________
    From: Murtadha Hubail <[email protected]>
    Sent: Sunday, November 17, 2019 1:50 PM
    To: Torsten Bergh Moss; [email protected]
    Subject: Re: Large UDFs

    Torsten,

    The maximum HTTP request size is configurable using the property
    (max.web.request.size) and by default it is set to 50MB.

    Cheers,
    Murtadha

    On 11/17/2019, 3:34 PM, "Torsten Bergh Moss" <[email protected]> wrote:

        I must say that I feel really confident that the problem has to do with
        the size of the UDF.

        I realized a lot of the dependencies actually were related to Asterix,
        thus redundant, so I solved the dependency problem by unapologetically
        cloning the repos for the external libraries my UDF is explicitly using
        and adding the code to the repo. It worked.

        However, my UDF is based on machine learning (Naive Bayes for sentiment
        analysis of Tweets), and is trained on about 900 000 tweets. The trained
        model manifests as large dictionaries containing term frequencies for the
        different classes/sentiments. So in order to use my UDF I either have to
        upload it with the training data or serialized versions of these
        dictionaries.

        And I can see that if I mvn package my UDF without these large files
        (.csv or .ser) it is "accepted" by the server when I send it via POST,
        but if I add these large files to the repo and then mvn package the UDF,
        the server rejects it because of file size. In other words, it seems to
        depend solely on the presence of these big files. And it kind of makes
        sense, as that is exactly what the cc.log file is saying: "A large
        request encountered. Closing the channel."
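
One way to confirm this before sending the POST is to compare the size of the packaged jar with the configured limit. The sketch below is only illustrative: the function name and the jar path are hypothetical, the 50 MB figure is the default quoted elsewhere in this thread (read here as 50 * 1024 * 1024 bytes, which is an assumption), and the actual HTTP request is somewhat larger than the file itself because of headers and any encoding overhead.

```python
import os

# Assumed reading of the 50MB default: 50 * 1024 * 1024 bytes.
DEFAULT_LIMIT_BYTES = 50 * 1024 * 1024


def exceeds_limit(path: str, limit_bytes: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Return True if the file at `path` alone is larger than the limit."""
    return os.path.getsize(path) > limit_bytes


if __name__ == "__main__":
    jar = "target/my-udf-jar-with-dependencies.jar"  # hypothetical path
    if os.path.exists(jar):
        print(jar, "exceeds limit:", exceeds_limit(jar))
```

A jar that passes this check can still be rejected if it sits just under the limit, since the check looks at the file size only.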

        Best wishes,
        Torsten

        ________________________________________
        From: Xikui Wang <[email protected]>
        Sent: Sunday, November 17, 2019 12:21 AM
        To: [email protected]
        Subject: Re: Large UDFs

        I think the warning message that you see probably is orthogonal to the
        dependencies that you are trying to add, since the installation of UDF
        merely copies the jar files to a designated location for AsterixDB to
        discover. It shouldn't touch the code that raises the warning message.
        Maybe that's related to how you interacted with the system? Not sure...

        As for handling large dependency libraries, besides making a fat jar, you
        can also copy the dependency jar files into the
        "apache-asterixdb-0.9.5-SNAPSHOT/repo" folder, so these jars can be
        deployed to the cluster together with AsterixDB and then be used by UDFs
        directly.

        Best,
        Xikui

        On Sat, Nov 16, 2019 at 2:55 PM Ian Maxon <[email protected]> wrote:

        > Sounds like a bug, can you share the UDF in question so I can debug it?
        >
        > > On Nov 16, 2019, at 05:17, Torsten Bergh Moss <[email protected]> wrote:
        > >
        > > Greetings devs,
        > >
        > >
        > > Hope you are all enjoying your weekends.
        > >
        > >
        > > I am trying to build a GPU-based UDF, and this UDF relies on a bunch of
        > > dependencies (one of them being the GPU-framework). In order to "bake"
        > > these dependencies into the UDF I am packaging it as a
        > > jar-with-dependencies, however, this jar ends up being too big to deploy
        > > as a UDF as the Hyracks Http Server cries out
        > >
        > >
        > > [nioEventLoopGroup-5-7] WARN org.apache.hyracks.http.server.HttpRequestAggregator - A large request encountered. Closing the channel.
        > >
        > >
        > > Is there any way to adjust these file size limits, or should UDFs with
        > > dependencies be handled some other way? I looked into the
        > > HttpRequestAggregator.java file and tried following some trails, but I
        > > can't seem to discover where the limit is actually set.
        > >
        > >
        > > Best wishes,
        > >
        > > Torsten
        >





