Thanks Murtadha,

Do I configure this property under [cc] inside cc.conf?

Best wishes,
Torsten
________________________________________
From: Murtadha Hubail <hubail...@gmail.com>
Sent: Sunday, November 17, 2019 1:50 PM
To: Torsten Bergh Moss; dev@asterixdb.apache.org
Subject: Re: Large UDFs

Torsten,

The maximum HTTP request size is configurable using the property 
(max.web.request.size) and by default it is set to 50MB.
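
As a quick pre-flight check before POSTing, the packaged UDF can be compared
against this default. The following is only a minimal Python sketch: the helper
names are hypothetical, it assumes 50MB means 50 * 1024 * 1024 bytes, and it
ignores the extra bytes that HTTP encoding adds to the request, so an archive
close to the limit should be treated as too large.

```python
import os

# Assumption: the 50MB default is interpreted as 50 * 1024 * 1024 bytes.
DEFAULT_MAX_WEB_REQUEST_SIZE = 50 * 1024 * 1024


def exceeds_request_limit(size_in_bytes, limit=DEFAULT_MAX_WEB_REQUEST_SIZE):
    """Return True if a payload of this size is larger than the limit."""
    return size_in_bytes > limit


def archive_exceeds_limit(path, limit=DEFAULT_MAX_WEB_REQUEST_SIZE):
    """Return True if the file at `path` is larger than the limit.

    Only the file size on disk is checked; the real request is slightly
    larger because of HTTP headers and any multipart framing.
    """
    return exceeds_request_limit(os.path.getsize(path), limit)
```

If archive_exceeds_limit returns True for the packaged UDF, either the limit
has to be raised or the large files have to be shipped some other way.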

Cheers,
Murtadha

On 11/17/2019, 3:34 PM, "Torsten Bergh Moss" <torsten.b.m...@ig.ntnu.no> wrote:

    I must say that I feel really confident that the problem has to do with the
    size of the UDF.

    I realized a lot of the dependencies actually were related to Asterix, thus
    redundant, so I solved the dependency problem by unapologetically cloning the
    repos for the external libraries my UDF is explicitly using and adding the code
    to the repo. It worked.

    However, my UDF is based on machine learning (Naive Bayes for sentiment
    analysis of Tweets), and is trained on about 900 000 tweets. The trained model
    manifests as large dictionaries containing term frequencies for the different
    classes/sentiments. So in order to use my UDF I either have to upload it with
    the training data or serialized versions of these dictionaries.

    And I can see that if I mvn package my UDF without these large files (.csv
    or .ser) it is "accepted" by the server when I send it via POST, but if I add
    these large files to the repo and then mvn package the UDF, the server
    rejects it because of file size. In other words, it seems to depend solely on
    the presence of these big files. And I mean it kind of makes sense, as that is
    exactly what the cc.log file is saying: "A large request encountered. Closing
    the channel."

    Best wishes,
    Torsten

    ________________________________________
    From: Xikui Wang <xik...@uci.edu>
    Sent: Sunday, November 17, 2019 12:21 AM
    To: dev@asterixdb.apache.org
    Subject: Re: Large UDFs

    I think the warning message that you see is probably orthogonal to the
    dependencies that you are trying to add, since the installation of a UDF
    merely copies the jar files to a designated location for AsterixDB to
    discover. It shouldn't touch the code that raises the warning message.
    Maybe that's related to how you interacted with the system? Not sure...

    As for handling large dependency libraries, besides making a fat jar, you
    can also copy the dependency jar files into the
    "apache-asterixdb-0.9.5-SNAPSHOT/repo" folder, so these jars can be
    deployed to the cluster together with AsterixDB and then be used by UDFs
    directly.

    Best,
    Xikui

    On Sat, Nov 16, 2019 at 2:55 PM Ian Maxon <ima...@uci.edu> wrote:

    > Sounds like a bug, can you share the UDF in question so I can debug it?
    >
    > > On Nov 16, 2019, at 05:17, Torsten Bergh Moss <torsten.b.m...@ig.ntnu.no> wrote:
    > >
    > > Greetings devs,
    > >
    > >
    > > Hope you are all enjoying your weekends.
    > >
    > >
    > > I am trying to build a GPU-based UDF, and this UDF relies on a bunch of
    > > dependencies (one of them being the GPU-framework). In order to "bake"
    > > these dependencies into the UDF I am packaging it as a
    > > jar-with-dependencies, however, this jar ends up being too big to deploy
    > > as a UDF as the Hyracks Http Server cries out
    > >
    > >
    > > [nioEventLoopGroup-5-7] WARN
    > > org.apache.hyracks.http.server.HttpRequestAggregator - A large request
    > > encountered. Closing the channel.
    > >
    > >
    > > Is there any way to adjust these file size limits, or should UDFs with
    > > dependencies be handled some other way? I looked into the
    > > HttpRequestAggregator.java file and tried following some trails, but I
    > > can't seem to discover where the limit is actually set.
    > >
    > >
    > > Best wishes,
    > >
    > > Torsten
    >


