Re: [VOTE] Release Apache AsterixDB 0.9.6 and Hyracks 0.3.6 (RC0)

2020-10-13 Thread Xikui Wang
+1

- Verified the hashes of NCService Installer
- Tested Twitter feed with drop-in dependencies


On Mon, Oct 12, 2020 at 11:19 PM Taewoo Kim  wrote:

> [v] +1 release these packages as Apache AsterixDB 0.9.6 and Apache Hyracks
> 0.3.6
>
> - Verified signatures and hashes
> - Verified that source built correctly
> - Smoke test using binary
>
> On Mon, Oct 12, 2020 at 4:26 PM Mike Carey  wrote:
>
> > REMINDER:  Folks should please verify and vote! (That way our Fearless
> > Leader can perhaps include the outcome in the report that's due in 2
> > days. :-))
> >
> > On 10/4/20 11:51 PM, Ian Maxon wrote:
> > > Hi everyone,
> > >
> > > Please verify and vote on the latest release of Apache AsterixDB
> > >
> > > The change that produced this release is up for review on Gerrit:
> > >
> > > https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/8225
> > >
> > > The release artifacts are as follows:
> > >
> > > AsterixDB Source
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.6-source-release.zip
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.6-source-release.zip.asc
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.6-source-release.zip.sha256
> > >
> > > SHA256:98443ff5a8bb5b25b38fa81b1a4fb43aeb1522742164462909062d1cdf7cd88d
> > >
> > > Hyracks Source
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.6-source-release.zip
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.6-source-release.zip.asc
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.6-source-release.zip.sha256
> > >
> > > SHA256:40546121dab77f49f29d74f9ae8138a0dc94daf8b6e4f6ed42e070d1981efdcb
> > >
> > > AsterixDB NCService Installer:
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.6-binary-assembly.zip
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.6-binary-assembly.zip.asc
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.6-binary-assembly.zip.sha256
> > >
> > > SHA256:6dd82a03cfa01891589c8e571892aa45b67548aaf2d1a03e5de179f1a38da5f5
> > >
> > > The KEYS file containing the PGP keys used to sign the release can be
> > > found at
> > >
> > > https://dist.apache.org/repos/dist/release/asterixdb/KEYS
> > >
> > > RAT was executed as part of Maven via the RAT maven plugin, but
> > > excludes files that are:
> > >
> > > - data for tests
> > > - procedurally generated,
> > > - or source files which come without a header mentioning their license,
> > >but have an explicit reference in the LICENSE file.
> > >
> > >
> > > The vote is open for 72 hours, or until the necessary number of votes
> > > (3 +1) has been reached.
> > >
> > > Please vote
> > > [ ] +1 release these packages as Apache AsterixDB 0.9.6 and
> > > Apache Hyracks 0.3.6
> > > [ ] 0 No strong feeling either way
> > > [ ] -1 do not release one or both packages because ...
> > >
> > > Thanks!
> >
>


Enable Trace via cc.conf

2020-07-09 Thread Xikui Wang
Hi Devs,

I'm trying to get the trace log for a certain class when deploying the
system on a cluster. I remember that we have a configuration in *cc.conf* to
enable the trace log for a specific class. Does anyone remember the right
way to specify that? Thanks!

Best,
Xikui


Re: [VOTE] Release Apache AsterixDB 0.9.5 and Hyracks 0.3.5 (RC4)

2020-07-09 Thread Xikui Wang
[ X ] +1 release these packages as Apache AsterixDB 0.9.5 and
Apache Hyracks 0.3.5

- verified SHAs
- verified Twitter feed with drop-in jar dependencies

Best,
Xikui

On Thu, Jul 9, 2020 at 8:26 AM Wail Alkowaileet  wrote:

> [ X ] +1 release these packages as Apache AsterixDB 0.9.5 and
> Apache Hyracks 0.3.5
>
> - Verified signatures and hashes
> - Verified source build with Java 11
> - Ingested and queried a few tweets and everything seems to be working
> correctly.
>
> On Wed, Jul 8, 2020 at 7:18 PM Taewoo Kim  wrote:
>
> > [ X] +1 release these packages as Apache AsterixDB 0.9.5 and Apache
> Hyracks
> > 0.3.5
> >
> > Followed the directions on the following page.
> >
> https://cwiki.apache.org/confluence/display/ASTERIXDB/Release+Verification
> >
> > [v] Verify signatures and hashes
> > [v] Verify that source builds correctly
> > [v] Smoke test
> >
> > Best,
> > Taewoo
> >
> >
> > On Wed, Jul 8, 2020 at 6:04 AM Michael Blow 
> > wrote:
> >
> > > [ X ] +1 release these packages as Apache AsterixDB 0.9.5 and
> > > Apache Hyracks 0.3.5
> > >
> > > Checked:
> > > - keys, signatures on all packages
> > > - SHAs
> > > - sanity check of LICENSE / NOTICEs
> > > - functional build of source packages
> > > - all versions advanced from SNAPSHOT
> > >
> > >
> > > On Mon, Jul 6, 2020 at 6:51 PM Ian Maxon  wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > Please verify and vote on the latest release of Apache AsterixDB
> > > >
> > > > The change that produced this release is up for review on Gerrit:
> > > >
> > > > https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/7124
> > > >
> > > > The release artifacts are as follows:
> > > >
> > > > AsterixDB Source
> > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.5-source-release.zip
> > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.5-source-release.zip.asc
> > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.5-source-release.zip.sha256
> > > >
> > > >
> SHA256:09affe9ce5aa75add6c5a75c51505e619f85cb7a87eb3b9d977ac472d5387bd1
> > > >
> > > > Hyracks Source
> > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.5-source-release.zip
> > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.5-source-release.zip.asc
> > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.5-source-release.zip.sha256
> > > >
> > > >
> SHA256:577d2b3da91ebfa37c113bae18561dcbfae0bdd526edee604b747f6044f4a03b
> > > >
> > > > AsterixDB NCService Installer:
> > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.5-binary-assembly.zip
> > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.5-binary-assembly.zip.asc
> > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.5-binary-assembly.zip.sha256
> > > >
> > > >
> SHA256:6854e71fc78f9cfb68b0dc3c61edb5f5c94b09b41f4a8deaf4c2fc9d804abcac
> > > >
> > > > The KEYS file containing the PGP keys used to sign the release can be
> > > > found at
> > > >
> > > > https://dist.apache.org/repos/dist/release/asterixdb/KEYS
> > > >
> > > > RAT was executed as part of Maven via the RAT maven plugin, but
> > > > excludes files that are:
> > > >
> > > > - data for tests
> > > > - procedurally generated,
> > > > - or source files which come without a header mentioning their
> license,
> > > >   but have an explicit reference in the LICENSE file.
> > > >
> > > >
> > > > The vote is open for 72 hours, or until the necessary number of votes
> > > > (3 +1) has been reached.
> > > >
> > > > Please vote
> > > > [ ] +1 release these packages as Apache AsterixDB 0.9.5 and
> > > > Apache Hyracks 0.3.5
> > > > [ ] 0 No strong feeling either way
> > > > [ ] -1 do not release one or both packages because ...
> > > >
> > > > Thanks!
> > > >
> > >
> >
>
>
> --
>
> *Regards,*
> Wail Alkowaileet
>


Re: Adding functions from extensions

2020-05-29 Thread Xikui Wang
Ah! I see. Adding functions from the customized *IntegrationUtil.java* does
the trick. Thanks a lot!

Best,
Xikui
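
[Editor's note: the pattern Dmitry describes — a public registration hook that extension initialization code calls — can be sketched in plain Java. The class and method names below are illustrative stand-ins only, not the actual BuiltinFunctions API.]

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in for a global builtin-function registry (NOT the real BuiltinFunctions class). */
class FunctionRegistry {
    private static final Map<String, Integer> FUNCTIONS = new LinkedHashMap<>();

    /** Public, so extension init code (e.g. a custom IntegrationUtil) can call it. */
    static void addFunction(String name, int arity) {
        FUNCTIONS.put(name + "#" + arity, arity);
    }

    static boolean isRegistered(String name, int arity) {
        return FUNCTIONS.containsKey(name + "#" + arity);
    }
}

public class ExtensionInitDemo {
    public static void main(String[] args) {
        // An extension's startup hook registers its functions before queries run.
        FunctionRegistry.addFunction("bad-channel-eval", 1);
        System.out.println(FunctionRegistry.isRegistered("bad-channel-eval", 1)); // true
    }
}
```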

On Fri, May 29, 2020 at 2:11 PM Dmitry Lychagin
 wrote:

> Xikui,
>
> Do you mean adding functions to BuiltinFunctions?
> BuiltinFunctions.addFunction() is public, so could be called from any of
> your custom initialization code to register new functions there.
>
> Thanks,
> -- Dmitry
>
>
> On 5/28/20, 7:45 PM, "Xikui Wang"  wrote:
>
> Hi Devs,
>
> I'm trying to add some functions from the BAD extension to AsterixDB. I
> found there are two ways of adding runtime entities into the
> FunctionCollection: 1) Adding an IFunctionRegistrant, 2) Implementing
> an
> extension function manager. However, I didn't find a way to add
> functions to
> Metadata. I wonder if there is an existing method for doing so? I did
> something similar to the IFunctionRegistrant in the
> MetadataBuiltinFunctions on my branch. If we don't have an existing
> method,
> I can try to create a patch on this.
>
> Best,
> Xikui
>
>


Adding functions from extensions

2020-05-28 Thread Xikui Wang
Hi Devs,

I'm trying to add some functions from the BAD extension to AsterixDB. I
found there are two ways of adding runtime entities into the
FunctionCollection: 1) Adding an IFunctionRegistrant, 2) Implementing an
extension function manager. However, I didn't find a way to add functions to
Metadata. I wonder if there is an existing method for doing so? I did
something similar to the IFunctionRegistrant in the
MetadataBuiltinFunctions on my branch. If we don't have an existing method,
I can try to create a patch on this.

Best,
Xikui


Re: Micro-batch semantics for UDFs

2020-03-12 Thread Xikui Wang
Hi Torsten,

I've sent an invitation to your email address for the code repository. It's
under the "xikui_idea" branch. Let me know if you have any issues accessing
it. You can find all the classes you need by searching for
"PartitionHolders". :)

To fully achieve what you want to do, I think you will probably also need
to customize the UDF framework in AsterixDB to enable a Java UDF to take a
batch of records. That would require some additional work. Or you can just
wrap your GPU driver function as a special operator in AsterixDB and
connect that to the partition holders. That's just my two cents. You can
play with the codebase to see which option works best for you.

Sorry for the late reply. Things are getting hectic recently with the
Coronavirus situation. I hope everyone can stay safe and healthy during
this time.

Best,
Xikui

On Thu, Mar 12, 2020 at 1:27 PM Torsten Bergh Moss <
torsten.b.m...@ig.ntnu.no> wrote:

> Xikui, how can I get started with reusing the PartitionHolder you used for
> the ingestion project?
>
> Best wishes,
> Torsten
> 
> From: Torsten Bergh Moss 
> Sent: Sunday, March 8, 2020 5:15 PM
> To: dev@asterixdb.apache.org
> Subject: Re: Micro-batch semantics for UDFs
>
> Thanks for the feedback, and sorry for the late response, I've been busy
> with technical interviews.
>
> Xikui, the ingestion framework described in section 5 & 6 of your paper
> sounds perfect for my project. I could have an intake job receiving a
> stream of tweets and an insert job pulling batches of say 10k tweets from
> the intake job, preprocess the batch, run it through the neural network on
> the GPU to get the sentiments, then write the tweets with their sentiments
> to a dataset. Unless there are any unforeseen bottlenecks I think I should
> be able to achieve throughputs of up to 20k tweets per second with my
> current setup.
>
> Is the code related to your project available on a specific branch or in a
> separate repo maybe?
>
> Also, I believe a figure might be missing, revealed by the line "The
> decoupled ingestion framework is shown in Figure ??" early on page 8.
>
> Best wishes,
> Torsten
>
> 
> From: Xikui Wang 
> Sent: Sunday, March 1, 2020 5:41 AM
> To: dev@asterixdb.apache.org
> Subject: Re: Micro-batch semantics for UDFs
>
> Hi Torsten,
>
> In case you want to customize the UDF framework to trigger your UDF on a
> batch of records, you could consider reusing the PartitionHolder that I did
> for my enrichment for the ingestion project. It takes a number of records,
> processes them, and returns with the processed results. I used them to
> enable hash joins on feeds and refresh reference data per batch. That
> might be helpful. You can find more information here [1].
>
> [1] https://arxiv.org/pdf/1902.08271.pdf
>
> Best,
> Xikui
>
> On Thu, Feb 27, 2020 at 2:35 PM Dmitry Lychagin
>  wrote:
>
> > Torsten,
> >
> > I see a couple of possible approaches here:
> >
> > 1. Make your function operate on arrays of values instead of primitive
> > values.
> > You'll probably need to have a GROUP BY in your query to create an array
> > (using ARRAY_AGG() or GROUP AS variable).
> > Then pass that array to your function which would process it and would
> > also return a result array.
> > Then unnest that output array to get the cardinality back.
> >
> > 2. Alternatively,  you could try creating a new runtime for ASSIGN
> > operator that'd pass batches of input tuples to a new kind of function
> > evaluator.
> > You'll need to provide replacements for
> > AssignPOperator/AssignRuntimeFactory.
> > Also you'd need to modify InlineVariablesRule[1] so it doesn't inline
> > those ASSIGNS.
> >
> > [1]
> >
> https://github.com/apache/asterixdb/blob/master/hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/InlineVariablesRule.java#L144
> >
> > Thanks,
> > -- Dmitry
> >
> >
> > On 2/27/20, 2:02 PM, "Torsten Bergh Moss" 
> > wrote:
> >
> > Greetings everyone,
> >
> >
> > I'm experimenting a lot with UDF's utilizing Neural Network
> inference,
> > mainly for classification of tweets. Problem is, running the UDF's in a
> > one-at-a-time fashion severely under-exploits the capacity of GPU-powered
> > NN's, as well as there being a certain latency associated with moving
> data
> from the CPU to the GPU and back every time the UDF is called, causing
> > poor performance.
> >
> >
> > Ide

Re: Micro-batch semantics for UDFs

2020-02-29 Thread Xikui Wang
Hi Torsten,

In case you want to customize the UDF framework to trigger your UDF on a
batch of records, you could consider reusing the PartitionHolder that I did
for my enrichment for the ingestion project. It takes a number of records,
processes them, and returns with the processed results. I used them to
enable hash joins on feeds and refresh reference data per batch. That
might be helpful. You can find more information here [1].

[1] https://arxiv.org/pdf/1902.08271.pdf

Best,
Xikui
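
[Editor's note: the batch-per-call idea above can be sketched independently of AsterixDB. The accumulator below is illustrative only — the real PartitionHolder API differs — but it shows why batching matters for GPU inference: records are buffered and handed to the batch function (one "GPU call") only when the batch fills.]

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Illustrative micro-batch accumulator (NOT the real PartitionHolder API). */
class MicroBatcher<T, R> {
    private final int batchSize;
    private final Function<List<T>, List<R>> batchFn; // e.g. one GPU inference per call
    private final List<T> buffer = new ArrayList<>();
    private final List<R> out = new ArrayList<>();

    MicroBatcher(int batchSize, Function<List<T>, List<R>> batchFn) {
        this.batchSize = batchSize;
        this.batchFn = batchFn;
    }

    void add(T record) {               // called once per incoming record
        buffer.add(record);
        if (buffer.size() >= batchSize) flush();
    }

    void flush() {                     // also called at end-of-stream
        if (!buffer.isEmpty()) {
            out.addAll(batchFn.apply(new ArrayList<>(buffer)));
            buffer.clear();
        }
    }

    List<R> results() { return out; }
}

public class MicroBatchDemo {
    public static void main(String[] args) {
        final int[] gpuCalls = {0};
        MicroBatcher<String, String> b = new MicroBatcher<>(3, batch -> {
            gpuCalls[0]++;             // one expensive device round-trip per batch
            List<String> labels = new ArrayList<>();
            for (String t : batch) {
                labels.add(t.length() > 66 ? "Amazing!" : "Boring!");
            }
            return labels;
        });
        for (int i = 0; i < 7; i++) {
            b.add("tweet " + i);
        }
        b.flush();                     // drain the final partial batch
        // 7 records cost 3 device calls instead of 7
        System.out.println(b.results().size() + " records, " + gpuCalls[0] + " calls");
    }
}
```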

On Thu, Feb 27, 2020 at 2:35 PM Dmitry Lychagin
 wrote:

> Torsten,
>
> I see a couple of possible approaches here:
>
> 1. Make your function operate on arrays of values instead of primitive
> values.
> You'll probably need to have a GROUP BY in your query to create an array
> (using ARRAY_AGG() or GROUP AS variable).
> Then pass that array to your function which would process it and would
> also return a result array.
> Then unnest that output array to get the cardinality back.
>
> 2. Alternatively,  you could try creating a new runtime for ASSIGN
> operator that'd pass batches of input tuples to a new kind of function
> evaluator.
> You'll need to provide replacements for
> AssignPOperator/AssignRuntimeFactory.
> Also you'd need to modify InlineVariablesRule[1] so it doesn't inline
> those ASSIGNS.
>
> [1]
> https://github.com/apache/asterixdb/blob/master/hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/InlineVariablesRule.java#L144
>
> Thanks,
> -- Dmitry
>
>
> On 2/27/20, 2:02 PM, "Torsten Bergh Moss" 
> wrote:
>
> Greetings everyone,
>
>
> I'm experimenting a lot with UDF's utilizing Neural Network inference,
> mainly for classification of tweets. Problem is, running the UDF's in a
> one-at-a-time fashion severely under-exploits the capacity of GPU-powered
> NN's, as well as there being a certain latency associated with moving data
> from the CPU to the GPU and back every time the UDF is called, causing
> poor performance.
>
>
> Ideally it would be possible to use the UDF to process records in a
> micro-batch fashion, letting them accumulate until a certain batch-size is
> reached (as big as my GPU's memory can handle) before passing the data
> along to the neural network to get the outputs.
>
>
> Is there a way to accomplish this with the current UDF framework
> (either in java or python)? If not, where would I have to start to develop
> such a feature?
>
>
> Best wishes,
>
> Torsten Bergh Moss
>
>
>


Re: UDF Lifecycle

2019-11-17 Thread Xikui Wang
I wonder what the deployment-initialization would do?

btw, the UDF does have a deinitialize() method which is expected to be
invoked when the UDF is deinitialized, but that is ignored for now, as the
IScalarEvaluator in general does not deinitialize. To make that work, we
would need a bigger change in Hyracks to make it aware of that step. This
could be one improvement as well...

Best,
Xikui

On Sun, Nov 17, 2019 at 11:30 AM Till Westmann  wrote:

> It seems that it'd be nice if we had a step (similar to the
> initialization step) in the deployment lifecycle as well.
> And I guess that we'd need a corresponding clean-up step for
> un-deployment as well.
>
> Does that make sense? If so, should we file an improvement for this?
>
> Cheers,
> Till
>
> On 17 Nov 2019, at 9:29, Xikui Wang wrote:
>
> > The UDF interface has an initialize method which is invoked once per
> > lifecycle. Putting the model loading code in there can probably solve
> > your
> > problem. The initialization is done per query (Hyracks job). For
> > example, if
> > you do
> >
> > SELECT mylib#myudf(t) FROM Tweets t;
> >
> > in which there are 100 tweets in the Tweets dataset. The
> > initialization
> > method will be called once and the evaluate method will be invoked 100
> > times. In the context of feeds attached with UDFs, the
> > initialization happens only once, when the feed starts.
> >
> > Best,
> > Xikui
> >
> > On Sun, Nov 17, 2019 at 6:44 AM Torsten Bergh Moss <
> > torsten.b.m...@ig.ntnu.no> wrote:
> >
> >> Dear developers,
> >>
> >>
> >> I am trying to build a machine learning-based UDF for classification.
> >> This
> >> involves loading in a model that has been trained offline, which in
> >> practice basically is deserialization of a big object. This process
> >> of
> >> deserialization takes a significant amount of time, but it only
> >> "needs" to
> >> happen once, and after that the model can do the classification
> >> rather
> >> rapidly.
> >>
> >>
> >> Therefore, in order to avoid having to load the model every time the
> >> UDF
> >> is called, I am wondering where in the UDF lifecycle I can do the
> >> loading
> >> in order to achieve a "load model once, classify
> >> infinitely"-scenario, and
> >> how to implement it. I am assuming it should be done somewhere inside
> >> the
> >> factory-function-relationship, but I am not sure where/how and can't
> >> seem
> >> to find a lot of documentation on it.
> >>
> >>
> >> All help is appreciated, thanks!
> >>
> >>
> >> Best wishes,
> >>
> >> Torsten
> >>
>


Re: UDF Lifecycle

2019-11-17 Thread Xikui Wang
The UDF interface has an initialize method which is invoked once per
lifecycle. Putting the model loading code in there can probably solve your
problem. The initialization is done per query (Hyracks job). For example, if
you do

SELECT mylib#myudf(t) FROM Tweets t;

in which there are 100 tweets in the Tweets dataset. The initialization
method will be called once and the evaluate method will be invoked 100
times. In the context of feeds attached with UDFs, the
initialization happens only once, when the feed starts.
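
[Editor's note: that "initialize once, evaluate many" contract is what makes one-time model loading viable. Below is a plain-Java sketch of the pattern with simplified names — it is not the actual AsterixDB external-function interface.]

```java
/** Illustrative lifecycle sketch (simplified; NOT the real AsterixDB UDF API). */
class SentimentUdf {
    static int modelLoads = 0;         // counts expensive deserializations
    private String model;              // stands in for a big deserialized model

    void initialize() {                // called once per job (or once per feed)
        model = "trained-model";       // the slow work happens exactly once
        modelLoads++;
    }

    String evaluate(String tweet) {    // called once per record; reuses the model
        return (model != null && tweet.length() > 66) ? "Amazing!" : "Boring!";
    }
}

public class UdfLifecycleDemo {
    public static void main(String[] args) {
        SentimentUdf udf = new SentimentUdf();
        udf.initialize();                // the framework calls this once
        for (int i = 0; i < 100; i++) {
            udf.evaluate("tweet " + i);  // 100 evaluations, 1 model load
        }
        System.out.println("model loads: " + SentimentUdf.modelLoads);
    }
}
```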

Best,
Xikui

On Sun, Nov 17, 2019 at 6:44 AM Torsten Bergh Moss <
torsten.b.m...@ig.ntnu.no> wrote:

> Dear developers,
>
>
> I am trying to build a machine learning-based UDF for classification. This
> involves loading in a model that has been trained offline, which in
> practice basically is deserialization of a big object. This process of
> deserialization takes a significant amount of time, but it only "needs" to
> happen once, and after that the model can do the classification rather
> rapidly.
>
>
> Therefore, in order to avoid having to load the model every time the UDF
> is called, I am wondering where in the UDF lifecycle I can do the loading
> in order to achieve a "load model once, classify infinitely"-scenario, and
> how to implement it. I am assuming it should be done somewhere inside the
> factory-function-relationship, but I am not sure where/how and can't seem
> to find a lot of documentation on it.
>
>
> All help is appreciated, thanks!
>
>
> Best wishes,
>
> Torsten
>


Re: Questionable UDF behaviour

2019-11-16 Thread Xikui Wang
Hi Torsten,

Sorry about the confusion. The problem that you see is because of a minor
bug in the template. At line 60 of the sentiment function, we should
create a new JString object instead of getting a JString from the function
helper. Getting it from the helper causes the variable to be reclaimed by
the function helper for parameter setting later, which messes up the data.
Replacing line 60 with the following code should resolve your issue. The
template repo is updated as well.

jString = new JString("");
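
[Editor's note: for anyone curious why reusing the helper-owned object corrupts earlier fields, the aliasing effect can be reproduced with plain Java. MutableStr below is a stand-in for JString, not the real class.]

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for JString: a mutable string holder (NOT the real AsterixDB class). */
class MutableStr {
    String v;
    MutableStr(String v) { this.v = v; }
}

public class AliasDemo {
    public static void main(String[] args) {
        List<MutableStr> results = new ArrayList<>();

        // Buggy pattern: one shared holder, mutated for every record.
        MutableStr shared = new MutableStr("");
        for (String sentiment : new String[] { "Boring!", "Amazing!" }) {
            shared.v = sentiment;
            results.add(shared);       // stores an alias, not a copy
        }
        // Both entries now show the LAST value written.
        System.out.println(results.get(0).v + " " + results.get(1).v); // Amazing! Amazing!

        // Fixed pattern (analogous to `new JString("")`): fresh holder per record.
        results.clear();
        for (String sentiment : new String[] { "Boring!", "Amazing!" }) {
            results.add(new MutableStr(sentiment));
        }
        System.out.println(results.get(0).v + " " + results.get(1).v); // Boring! Amazing!
    }
}
```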

Best,
Xikui

On Sat, Nov 16, 2019 at 7:05 AM Torsten Bergh Moss <
torsten.b.m...@ig.ntnu.no> wrote:

> I guess my attempt to inline screenshots of code and results in order to
> not have to worry about text-formatting failed miserably.
>
>
> Code:
>
> @Override
> public void evaluate(IFunctionHelper functionHelper) throws Exception {
> // Read input record
> JRecord inputRecord = (JRecord) functionHelper.getArgument(0);
>
> JLong id = (JLong) inputRecord.getValueByName("id");
> JString text = (JString) inputRecord.getValueByName("text");
>
> // Populate result record
> JRecord result = (JRecord) functionHelper.getResultObject();
> result.setField("id", id);
> result.setField("text", text);
>
> if (text.getValue().length() > 66) {
> this.jString.setValue("Amazing!");
> } else {
> this.jString.setValue("Boring!");
> }
> result.setField("Sentiment", jString);
> functionHelper.setResult(result);
> }
>
>
> Results:
>
> { "ProcessedTweet": { "id": 1170705127629611008, "text": "la verdad q si",
> "Sentiment": "Boring!" } }
> { "ProcessedTweet": { "id": 1170705134428532736, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
> { "ProcessedTweet": { "id": 1170705158998593541, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
> { "ProcessedTweet": { "id": 1170705204574085121, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
> { "ProcessedTweet": { "id": 1170705245414051842, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
> { "ProcessedTweet": { "id": 1170705264921776129, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
> { "ProcessedTweet": { "id": 1170705288711852033, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
> { "ProcessedTweet": { "id": 1170705318881505280, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
> { "ProcessedTweet": { "id": 1170705358068887558, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
> { "ProcessedTweet": { "id": 1170705359985684481, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
> { "ProcessedTweet": { "id": 1170705373050941440, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
> { "ProcessedTweet": { "id": 1170705421151154177, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
> { "ProcessedTweet": { "id": 1170705470966894592, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
> { "ProcessedTweet": { "id": 1170705480815140865, "text": "Amazing!",
> "Sentiment": "Amazing!" } }
>
>
> Best wishes,
>
> Torsten
>
> 
> From: Torsten Bergh Moss 
> Sent: Saturday, November 16, 2019 3:59 PM
> To: dev@asterixdb.apache.org
> Subject: Questionable UDF behaviour
>
>
> Hi it's me again,
>
>
> I know it's been mentioned to me before that the examples in
> https://github.com/idleft/asterix-udf-template/ are for template-purposes
> and not meant to actually be used, however, I hope somebody can shed light
> upon this behaviour that I can't wrap my head around.
>
>
> I am running the sample sentiment function,
> https://github.com/idleft/asterix-udf-template/blob/master/src/main/java/org/apache/asterix/external/library/SentimentFunction.java
> .
>
>
> The only change I've made from the original code has been changing the
> casting from JString to JLong on line 40 as the ID from tweets saved in
> AsterixDB are numbers (and will provoke an error if treated as JStrings).
>
>
> [inline screenshot: modified code]
>
>
> When running the UDF on my dataset of tweets it only works as expected on
> the first tweet?, for the rest of them it also changes the value of the
> text-field (which should equal the text in the original tweet), like this:
>
>
> [inline screenshot: query results]
>
> The SQL++ command I am using to produce these results is
>
>
> SELECT testlib#getSentiment(t) AS ProcessedTweet FROM (SELECT id, text
> FROM Tweets) AS t;
>
>
> Looking at the code, the UDF gets the text-value on line 41 and populates
> the text-field on the result-object on line 46. Can somebody give me some
> hints about why the text-field is set to the same value as the
> sentiment-field?
>
>
> Best wishes,
>
> Torsten
>


Re: Large UDFs

2019-11-16 Thread Xikui Wang
I think the warning message that you see probably is orthogonal to the
dependencies that you are trying to add, since the installation of UDF
merely copies the jar files to a designated location for AsterixDB to
discover. It shouldn't touch the code that raises the warning message.
Maybe that's related to how you interacted with the system? Not sure...

As for handling large dependency libraries, besides making a fat jar, you
can also copy the dependency jar files into the
"apache-asterixdb-0.9.5-SNAPSHOT/repo" folder, so these jars can be
deployed to the cluster together with AsterixDB and then be used by UDFs
directly.

Best,
Xikui

On Sat, Nov 16, 2019 at 2:55 PM Ian Maxon  wrote:

> Sounds like a bug, can you share the UDF in question so I can debug it?
>
> > On Nov 16, 2019, at 05:17, Torsten Bergh Moss 
> wrote:
> >
> > Greetings devs,
> >
> >
> > Hope you are all enjoying your weekends.
> >
> >
> > I am trying to build a GPU-based UDF, and this UDF relies on a bunch of
> dependencies (one of them being the GPU-framework). In order to "bake"
> these dependencies into the UDF I am packaging it as a
> jar-with-dependencies; however, this jar ends up being too big to deploy as
> a UDF as the Hyracks Http Server cries out
> >
> >
> > [nioEventLoopGroup-5-7] WARN
> org.apache.hyracks.http.server.HttpRequestAggregator - A large request
> encountered. Closing the channel.
> >
> >
> > Is there any way to adjust these file size limits, or should UDFs with
> dependencies be handled some other way? I looked into the
> HttpRequestAggregator.java file and tried following some trails, but I
> can't seem to discover where the limit is actually set.
> >
> >
> > Best wishes,
> >
> > Torsten
>


Re: Running the template sentiment analysis UDF

2019-10-11 Thread Xikui Wang
This patch has just been merged into the master branch. If you check out
the latest master, this issue should be gone.

Best,
Xikui

On Tue, Oct 8, 2019 at 9:55 AM Xikui Wang  wrote:

> Hi Torsten,
>
> The sentiment function in the template is for demo purposes and is not
> expected to be applied on Tweets directly. You need to modify it to work
> with Tweets.
>
> The error that you see is because of a bug in mapping an incoming Tweet to
> a Java record that is to be used in the UDF. Although you didn't use Array
> explicitly, the Tweets coming from Twitter do contain arrays. This is a
> known issue, and I have submitted a patch for this but it hasn't been
> merged yet. You can get the fix at [1].
>
> [1] https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/3405
>
> Best,
> Xikui
>
> On Tue, Oct 8, 2019 at 7:47 AM Torsten Bergh Moss <
> torsten.b.m...@ig.ntnu.no> wrote:
>
>> Hi!
>>
>>
>> I built a dataset of tweets using the twitter feed adaptor, and now I am
>> trying to run the tweets through the sample sentiment analysis UDF from the
>> template:
>>
>>
>>
>> https://github.com/idleft/asterix-udf-template/blob/master/src/main/java/org/apache/asterix/external/library/SentimentFunction.java
>>
>>
>> First I got an error about not being able to cast a JLong to a JString,
>> but I fixed it quickly by making the id JFloat on line 40.
>>
>>
>> Now I'm getting a RunTimeDataException stating "Cannot parse list item of
>> type array", however I cannot seem to find a use of neither arrays nor
>> lists in the function. I've also scanned the logs for clues without any
>> luck.
>>
>>
>> Any pointers in the right direction would be highly appreciated.
>>
>>
>> Best wishes,
>>
>> Torsten Bergh Moss
>>
>


Re: Running the template sentiment analysis UDF

2019-10-08 Thread Xikui Wang
Hi Torsten,

The sentiment function in the template is for demo purposes and is not
expected to be applied on Tweets directly. You need to modify it to work
with Tweets.

The error that you see is because of a bug in mapping an incoming Tweet to
a Java record that is to be used in the UDF. Although you didn't use Array
explicitly, the Tweets coming from Twitter do contain arrays. This is a
known issue, and I have submitted a patch for this but it hasn't been
merged yet. You can get the fix at [1].

[1] https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/3405

Best,
Xikui

On Tue, Oct 8, 2019 at 7:47 AM Torsten Bergh Moss 
wrote:

> Hi!
>
>
> I built a dataset of tweets using the twitter feed adaptor, and now I am
> trying to run the tweets through the sample sentiment analysis UDF from the
> template:
>
>
>
> https://github.com/idleft/asterix-udf-template/blob/master/src/main/java/org/apache/asterix/external/library/SentimentFunction.java
>
>
> First I got an error about not being able to cast a JLong to a JString,
> but I fixed it quickly by making the id JFloat on line 40.
>
>
> Now I'm getting a RunTimeDataException stating "Cannot parse list item of
> type array", however I cannot seem to find a use of neither arrays nor
> lists in the function. I've also scanned the logs for clues without any
> luck.
>
>
> Any pointers in the right direction would be highly appreciated.
>
>
> Best wishes,
>
> Torsten Bergh Moss
>


Re: [VOTE] Release Apache AsterixDB 0.9.5 and Hyracks 0.3.5 (RC3)

2019-09-13 Thread Xikui Wang
+1

- Tested drop-in Twitter4J dependencies
- Tested Twitter feed


On Fri, Sep 13, 2019 at 10:08 AM Chen Li  wrote:

> +1
>
> On Thu, Sep 12, 2019 at 4:50 PM Mike Carey  wrote:
>
> > +1
> >
> > - Successfully did NCService install and ran through the SQL++ 101
> > exercises
> >
> > On 9/12/19 3:47 PM, Wail Alkowaileet wrote:
> > > +1
> > >
> > > - Signatures and hashes ok.
> > > - NCService binary works.
> > > - Source compilation works.
> > > - Executed the sample cluster. Ingested tweets and run few queries.
> > >
> > > On Tue, Sep 3, 2019 at 6:02 PM Ian Maxon  wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> Please verify and vote on the latest release of Apache AsterixDB. This
> > >> candidate fixes the binary name and missing Netty notice from RC2.
> > >>
> > >> The change that produced this release and the change to advance the
> > >> version are
> > >> up for review on Gerrit:
> > >>
> > >>
> > >>
> >
> https://asterix-gerrit.ics.uci.edu/#/q/status:open+owner:%22Jenkins+%253Cjenkins%2540fulliautomatix.ics.uci.edu%253E%22
> > >>
> > >> The release artifacts are as follows:
> > >>
> > >> AsterixDB Source
> > >>
> > >>
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.5-source-release.zip
> > >>
> > >>
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.5-source-release.zip.asc
> > >>
> > >>
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.5-source-release.zip.sha256
> > >>
> > >>
> SHA256:be41051e803e5ada2c64f608614c6476c6686e043c47a2a0291ccfd25239a679
> > >>
> > >> Hyracks Source
> > >>
> > >>
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.5-source-release.zip
> > >>
> > >>
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.5-source-release.zip.asc
> > >>
> > >>
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.5-source-release.zip.sha256
> > >>
> > >>
> SHA256:b06fe983aa6837abe3460a157d7600662ec56181a43db317579f5c7ddf9bfc08
> > >>
> > >> AsterixDB NCService Installer:
> > >>
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.5.zip
> > >>
> > >>
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.5.zip.asc
> > >>
> > >>
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.5.zip.sha256
> > >>
> > >>
> > >> SHA256:
> > >>
> > >> The KEYS file containing the PGP keys used to sign the release can be
> > >> found at
> > >>
> > >> https://dist.apache.org/repos/dist/release/asterixdb/KEYS
> > >>
> > >> RAT was executed as part of Maven via the RAT maven plugin, but
> > >> excludes files that are:
> > >>
> > >> - data for tests
> > >> - procedurally generated,
> > >> - or source files which come without a header mentioning their
> license,
> > >>but have an explicit reference in the LICENSE file.
> > >>
> > >>
> > >> The vote is open for 72 hours, or until the necessary number of votes
> > >> (3 +1) has been reached.
> > >>
> > >> Please vote
> > >> [ ] +1 release these packages as Apache AsterixDB 0.9.5 and
> > >> Apache Hyracks 0.3.5
> > >> [ ] 0 No strong feeling either way
> > >> [ ] -1 do not release one or both packages because ...
> > >>
> > >> Thanks!
> > >>
> > >
> >
>


Re: [VOTE] Release Apache AsterixDB 0.9.5 and Hyracks 0.3.5 (RC1)

2019-07-09 Thread Xikui Wang
-1

Creating feeds failed with the same exception I mentioned last time.
Creating a feed with any adapter throws the following exception:

Error: Invalid feed parameters. Exception Message:ASX3083: Duplicate
feed adaptor name: twitter_pull

Note that the DatasourceFactoryProvider in feeds reads a resource file to
load all existing record readers. Somehow this resource file is either loaded
twice or replicated in the binary release. The reason it complains about
"twitter_pull" is that the Twitter record reader is the first entry in the
resource file.
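
The failure mode above (the same provider resource file visible twice on the
classpath, tripping a duplicate-name check) can be guarded against by treating
a repeated identical entry as benign. The sketch below is a hypothetical,
simplified registry; FactoryRegistry, register(), and the class names are
illustrative, not the actual DatasourceFactoryProvider API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: register record-reader factories defensively so that a
// resource file loaded twice does not raise "Duplicate feed adaptor name".
class FactoryRegistry {
    private final Map<String, String> factories = new LinkedHashMap<>();

    // Returns true when a new factory is registered. A repeated entry that maps
    // the same name to the same class (i.e., the resource file was simply read
    // twice) is ignored; only a genuine name collision fails.
    boolean register(String adaptorName, String className) {
        String existing = factories.get(adaptorName);
        if (existing == null) {
            factories.put(adaptorName, className);
            return true;
        }
        if (existing.equals(className)) {
            return false; // benign duplicate: same entry seen again
        }
        throw new IllegalStateException(
                "Duplicate feed adaptor name: " + adaptorName);
    }
}
```

With this approach, re-reading the replicated resource file is a no-op instead
of an error, while two different readers claiming the same name still fail fast.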

Best,
Xikui


On Tue, Jul 9, 2019 at 6:53 AM Mike Carey  wrote:

> +1
>
>   - Downloaded the binary
>
>   - Ran it through the SQL++ tutorial successfully
>
>
> On 7/3/19 9:02 AM, Taewoo Kim wrote:
> > [V] +1 release these packages as Apache AsterixDB 0.9.5 and Apache
> Hyracks
> > 0.3.5
> >
> > - Checked the SHA256 of the zip files.
> > - Builds were successful without any error.
> > - Smoke test using the binary was successful.
> >
> > Best,
> > Taewoo
> >
> >
> > On Tue, Jul 2, 2019 at 12:25 AM Ian Maxon  wrote:
> >
> >> Hi everyone,
> >>
> >> Please verify and vote on the latest release of Apache AsterixDB
> >>
> >> The change that produced this release and the change to advance the
> >> version are
> >> up for review on Gerrit:
> >>
> >>
> >>
> https://asterix-gerrit.ics.uci.edu/#/q/status:open+owner:%22Jenkins+%253Cjenkins%2540fulliautomatix.ics.uci.edu%253E%22
> >>
> >> The release artifacts are as follows:
> >>
> >> AsterixDB Source
> >>
> >>
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.5-source-release.zip
> >>
> >>
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.5-source-release.zip.asc
> >>
> >>
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.5-source-release.zip.sha256
> >>
> >> SHA256:35c0249bf7d8bb5868016589018eefb91c0dfde1f6b06001e859ffb3d9144638
> >>
> >> Hyracks Source
> >>
> >>
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.5-source-release.zip
> >>
> >>
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.5-source-release.zip.asc
> >>
> >>
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.5-source-release.zip.sha256
> >>
> >> SHA256:ee2eda7e9ff03978e21b4fc0db33854475d6dba70b5346b5e78308bf6d8efc72
> >>
> >> AsterixDB NCService Installer:
> >>
> >>
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.5-binary-assembly.zip
> >>
> >>
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.5-binary-assembly.zip.asc
> >>
> >>
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.5-binary-assembly.zip.sha256
> >>
> >> SHA256:e03f48e410e2ff84fc4b63f32bb50171747c5e0088845b4b462693f239c7d794
> >>
> >> The KEYS file containing the PGP keys used to sign the release can be
> >> found at
> >>
> >> https://dist.apache.org/repos/dist/release/asterixdb/KEYS
> >>
> >> RAT was executed as part of Maven via the RAT maven plugin, but
> >> excludes files that are:
> >>
> >> - data for tests
> >> - procedurally generated,
> >> - or source files which come without a header mentioning their license,
> >>but have an explicit reference in the LICENSE file.
> >>
> >>
> >> The vote is open for 72 hours, or until the necessary number of votes
> >> (3 +1) has been reached.
> >>
> >> Please vote
> >> [ ] +1 release these packages as Apache AsterixDB 0.9.5 and
> >> Apache Hyracks 0.3.5
> >> [ ] 0 No strong feeling either way
> >> [ ] -1 do not release one or both packages because ...
> >>
> >> Thanks!
> >>
>


Re: [VOTE] Release Apache AsterixDB 0.9.5 and Hyracks 0.3.5 (RC0)

2019-06-28 Thread Xikui Wang
-1

I've got the following error message when creating feeds:
Error: Invalid feed parameters. Exception Message:ASX3083: Duplicate feed
adaptor name: twitter_pull
This error happens whether I try to create a Twitter feed or a socket feed.

When trying to figure out the issue, I downloaded the source release. The
pom.xml of asterixdb has the wrong Hyracks and Algebricks versions; neither
should use the snapshot version.
After fixing the pom file and rebuilding the project, creating feeds on the
new build no longer shows the first issue. Maybe these two issues are related?

Best,
Xikui

On Fri, Jun 28, 2019 at 11:55 AM Mike Carey  wrote:

> I ran the system through its SQL++ tutorial paces:
>
> +1 release these packages as Apache AsterixDB 0.9.5 and Apache Hyracks
> 0.3.5
>
> On 6/25/19 6:19 PM, Ian Maxon wrote:
> > [ ] +1 release these packages as Apache AsterixDB 0.9.5 and
> > Apache Hyracks 0.3.5
>


Mailing list is down for UCI?

2019-04-25 Thread Xikui Wang
Hi Devs,

This is a test message, as we found that the mailing list seems to have
stopped working for those who are using a UCI email address...

Best,
Xikui


Re: Twitter adapter

2019-04-08 Thread Xikui Wang
Hi Sandra,

I haven't seen this problem before. Which AsterixDB version are you using?
If it's ok, please send me your DDLs with your configuration in a private
thread, so I can try that on my end.

Best,
Xikui

On Sat, Apr 6, 2019 at 2:15 AM sandraskarsh...@gmail.com <
sandraskarsh...@gmail.com> wrote:

>
>
> On 2019/04/05 17:59:20, Xikui Wang  wrote:
> > Hi Sandra,
> >
> > That's not an error but a reminder (which we probably should refactor
> > later) says the feed will try to use the configurations from the feed
> > definition. You don't really need to extend the access of your Twitter
> dev
> > account. As long as you follow the DDL template in the doc, you should be
> > fine. Here is the snippet from the documentation [1]:
> >
> > use feeds;
> >
> > create feed TwitterFeed with {
> >   "adapter-name": "push_twitter",
> >   "type-name": "Tweet",
> >   "format": "twitter-status",
> >   "consumer.key": "",
> >   "consumer.secret": "",
> >   "access.token": "**",
> >   "access.token.secret": "*"
> > };
> >
> >
> > Best,
> > Xikui
> >
> > On Fri, Apr 5, 2019 at 8:08 AM sandraskarsh...@gmail.com <
> > sandraskarsh...@gmail.com> wrote:
> >
> > > Hi!
> > >
> > > When using the built-in push-based Twitter adapter, which endpoint in
> the
> > > Twitter API is it actually making a request to? I have obtained a
> developer
> > > account at Twitter, but I am not sure if I need any extended access
> than
> > > what provided for the normal developer accounts in order to use the
> Twitter
> > > adapter.
> > >
> > > When trying to start the feed which uses the Twitter adapter, I find
> the
> > > following error in the cc.log, and the query never seem to finish its
> > > execution:
> > >
> > > [QueryTranslator] WARN  org.apache.asterix.external.util.TwitterUtil -
> > > unable to load authentication credentials from auth.properties
> > > filecredential information will be obtained from adapter's
> configuration
> > >
> > > Thanks in advance!
> > >
> > > Sandra
> > >
> >
> Hi Xikui, thank you for your reply!
>
> I am following the same example as in the documentation, and the query for
> creating the feed completes successfully. However, when trying to execute
> the query which should start the feed, the interface never outputs that the
> query has completed. If I then try to execute other queries (whilst the
> former haven't completed), it is not possible to successfully execute those
> either. I can not find anything in the logs which describes what is
> happening, other than a warning message I've never seen before:
> [QueryTranslator] WARN
> org.apache.hyracks.control.common.config.ConfigManager - NC option [nc]
> storage.lsm.bloomfilter.falsepositiverate being accessed outside of
> NC-scoped configuration.
>
> I am thinking maybe there is a problem connecting to Twitter, but I am not
> sure. Have you experienced anything similar before?
>
> Best,
> Sandra
>


Re: Twitter adapter

2019-04-05 Thread Xikui Wang
Hi Sandra,

That's not an error but a reminder (which we should probably reword later)
saying that the feed will try to use the configurations from the feed
definition. You don't really need to extend the access of your Twitter dev
account. As long as you follow the DDL template in the docs, you should be
fine. Here is the snippet from the documentation [1]:

use feeds;

create feed TwitterFeed with {
  "adapter-name": "push_twitter",
  "type-name": "Tweet",
  "format": "twitter-status",
  "consumer.key": "",
  "consumer.secret": "",
  "access.token": "**",
  "access.token.secret": "*"
};


Best,
Xikui

On Fri, Apr 5, 2019 at 8:08 AM sandraskarsh...@gmail.com <
sandraskarsh...@gmail.com> wrote:

> Hi!
>
> When using the built-in push-based Twitter adapter, which endpoint in the
> Twitter API is it actually making a request to? I have obtained a developer
> account at Twitter, but I am not sure if I need any extended access than
> what provided for the normal developer accounts in order to use the Twitter
> adapter.
>
> When trying to start the feed which uses the Twitter adapter, I find the
> following error in the cc.log, and the query never seem to finish its
> execution:
>
> [QueryTranslator] WARN  org.apache.asterix.external.util.TwitterUtil -
> unable to load authentication credentials from auth.properties
> filecredential information will be obtained from adapter's configuration
>
> Thanks in advance!
>
> Sandra
>


Re: Derived types in Java UDF?

2019-03-12 Thread Xikui Wang
I think the example that you showed should work for you, doesn't it? If it's
a primitive type, you can use the data type in BuiltinType,
e.g., BuiltinType.ASTRING. If it's a derived type, you have to construct an
ARecordType the way you did in your example. As you may have noticed, the
open/closed flag of the constructed type should be consistent with your DDLs.
In your case, they are both closed. If they are not consistent, there can be
serialization issues. I tried to construct a small test case with an open
data type based on the UpperCaseFunction, and it works well for me. Here is
the code snippet. You could move the data type constructions into the
initialization method to avoid creating them repeatedly.

Java UDF:
...
JRecord result = (JRecord) functionHelper.getResultObject();
result.setField("id", id);
result.setField("text", text);
String[] capFields = { "text" };
IAType[] capFieldTypes = { BuiltinType.ASTRING };
ARecordType capRecordType = new ARecordType("CapitalizedType",
capFields, capFieldTypes, true);
IJObject[] capFieldVals = { new JString("New field") };
JRecord capRecord = new JRecord(capRecordType, capFieldVals);
JOrderedList capitalized = new JOrderedList(capRecordType);
capitalized.add(capRecord);
result.setField("capitalized", capitalized);
functionHelper.setResult(result);
...

DDLs:
create type TextType if not exists as open {
id: int32,
text: string
};

create type CapitalizedType as open {
text: string
};

create type OutputTextType as open {
id: int32,
text: string,
capitalized: [CapitalizedType]
};

Best,
Xikui

On Tue, Mar 12, 2019 at 6:19 AM sandraskarsh...@gmail.com <
sandraskarsh...@gmail.com> wrote:

>
>
> On 2019/03/12 12:19:38, sandraskarsh...@gmail.com <
> sandraskarsh...@gmail.com> wrote:
> > Does this make sense in order to add TermFrequencyType objects to
> termFrequencies?
> >
> > termFrequencies.add(
> >  new JRecord(
> >  new ARecordType("TermFrequencyType", getTermFrequencyFields(),
> getFieldTypes(), false),
> >  new IJObject[]{ new JString("hello"), new JInt(1) }
> >  )
> > );
> >
> > And have I understood it correctly if I implement getFieldTypes() like
> this:
> >
> > IAType[] getFieldTypes() {
> >return new IAType[]{BuiltinType.ASTRING, BuiltinType.AINT32};
> > }
> >
> >
> >
>
> Excuse me, I of course ment which implementation of the IAType interface I
> should use :-)
>


Re: Derived types in Java UDF?

2019-03-11 Thread Xikui Wang
Hi Sandra,

You can use derived data types in UDFs, and I think you've got most of the
parts right. One thing you should do is define the termsFrequencies in your
Java UDF code as a JOrderedList or JUnorderedList instead of a Java array.
This should resolve the error. I've created an example of how to do that in
a Java UDF; you can find it here [1]. When constructing this example, I found
a minor bug in reading an array of record objects, but writing one, as in
your case, should be fine. This will be fixed once the patch [2] is merged.

[1]
https://asterix-gerrit.ics.uci.edu/#/c/3264/1/asterixdb/asterix-external-data/src/test/java/org/apache/asterix/external/library/UpperCaseFunction.java
[2] https://asterix-gerrit.ics.uci.edu/#/c/3264/

Best,
Xikui

On Mon, Mar 11, 2019 at 2:20 AM sandraskarsh...@gmail.com <
sandraskarsh...@gmail.com> wrote:

> Hi,
>
> I have a Java UDF, with input being a record containing an id as well as
> some text. The UDF outputs a record of type RelevantTweetType, by setting
> three additional fields, with one of them being an array of
> TermFrequencyType objects (see type below). Calling
> result.setField("termFrequencies", x) with the second parameter x being a
> JRecord[] results in an error, of course, since it is not an instance of
> type IJObject. I was wondering if it is possible to set an array of a
> datatype (TermFrequencyType) as a record field in a Java UDF, and if so,
> how? Or are the record fields limited to being primitive types when set in
> a Java UDF?
>
> CREATE TYPE RelevantTweetType AS CLOSED {
>   id: int32,
>   text: string,
>   threadid: int32,
>   relevant: boolean,
>   termsFrequencies: [TermFrequencyType]
> };
>
> CREATE TYPE TermFrequencyType AS CLOSED {
>   term: string,
>   frequency: int32
> };
>
> Thanks in advance,
> Sandra
>


Data flushing in AbstractOneInputOneOutputOneFramePushRuntime

2019-03-07 Thread Xikui Wang
Hi Devs,

I'm working on an experiment in which a socket feed receives data at a very
low rate (1 record/second). In data feeds, when data is not coming in fast
enough, the flow controller tries to force a flush before it goes into
waiting. However, I found that the data was not flushed as I expected in my
experiment.

After some investigation, I pinpointed a "potential suspect". In the
"flushAndReset" method of the AbstractOneInputOneOutputOneFramePushRuntime
class [1], the record appender writes data records into its writer, but
that writer is not flushed. Thus, the data records are not flushed as
expected, even when the "flushFramesRapidly" flag is on. After adding
a writer.flush() call, everything runs as expected.

This feels like a bug to me. Since this class touches many runtime
factories, and the change may have a performance impact, I want to
double-check with you before I submit the fix. Thoughts?

[1]
org/apache/hyracks/algebricks/runtime/operators/base/AbstractOneInputOneOutputOneFramePushRuntime.java:74
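
A minimal sketch of the behavior described above, using assumed, simplified
interfaces rather than the real Hyracks IFrameWriter/appender classes. The
point is that after the appender hands its frame to the writer, the writer
itself must also be flushed; otherwise records from a slow feed sit in the
writer's buffer:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the Hyracks writer/appender machinery (names are
// illustrative, not the actual API).
interface FrameWriter {
    void nextFrame(List<String> frame); // accept a frame into the writer
    void flush();                       // push buffered frames downstream
}

class BufferingWriter implements FrameWriter {
    final List<String> buffered = new ArrayList<>();
    final List<String> delivered = new ArrayList<>();

    public void nextFrame(List<String> frame) { buffered.addAll(frame); }
    public void flush() { delivered.addAll(buffered); buffered.clear(); }
}

class PushRuntime {
    private final List<String> appender = new ArrayList<>();
    private final FrameWriter writer;

    PushRuntime(FrameWriter writer) { this.writer = writer; }

    void addRecord(String record) { appender.add(record); }

    // Analogue of flushAndReset(): hand the appended records to the writer,
    // reset the appender, and also flush the writer itself.
    void flushAndReset() {
        writer.nextFrame(new ArrayList<>(appender));
        appender.clear();
        writer.flush(); // the added call; without it, records stay buffered
    }
}
```

Without the final writer.flush(), BufferingWriter.delivered stays empty after
flushAndReset(), which mirrors the stalled low-rate feed described above.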

Best,
Xikui


Re: Access data from dataset in UDF

2019-02-28 Thread Xikui Wang
Hi Sandra,

Just to clarify the Java UDF implementation: you *cannot* access AsterixDB
datasets from a Java UDF. To approximate what you want with a Java UDF, you
can load the reference data *files* in the Java UDF and update the files
externally, at the high cost of reloading them from time to time. Both
options are discussed in the paper.

Best,
Xikui

On Thu, Feb 28, 2019 at 9:07 AM Xikui Wang  wrote:

> Hi Sandra,
>
> To answer your question in short: Yes, you can use Java UDF to do that.
>
> One thing worth noticing is that, whether you use a Java UDF or a SQL++
> UDF, there can be issues in some cases, as you are accessing dataset on a
> feed pipeline and that dataset is being actively fed by the other data
> feed. I recently submitted a paper that discussed a similar problem. There
> are some examples of using SQL++ UDFs or Java UDFs on a feed pipeline in
> the paper as well. I've attached the latest draft of that paper, and it's
> on arXiv as well [1] (the latest draft is under processing). Please have a
> look and let me know whether that helps.
>
> [1] https://arxiv.org/abs/1902.08271
>
> Best,
> Xikui
>
> On Thu, Feb 28, 2019 at 12:54 AM sandraskarsh...@gmail.com <
> sandraskarsh...@gmail.com> wrote:
>
>> Hi!
>>
>> I am trying to understand how to access data stored in a dataset, say the
>> dataset "UserQueries", from a UDF. Say the intent of the given UDF is
>> similar to the "WordsInList" UDF created here:
>> https://github.com/idleft/asterix-udf-template/blob/master/src/main/java/org/apache/asterix/external/library/WordInListFunction.java
>>
>> The possible pipeline of the system would look like this:
>> A socket feed is created and started, which listens to incoming data of
>> the type "UserQuery". I’ve created a user interface which will send data to
>> the specific socket in ADM format. This data is stored in the dataset
>> "UserQueries". Then, I wish to access the data in a given record within
>> "UserQueries" to find the keywords to use in the WordInList UDF. This
>> function/UDF is then going to be used as a query predicate to filter the
>> incoming data.
>>
>> Must the UDF be written in SQL++ format in order to achieve this, or is
>> it possible to write it in Java? The “Data Ingestion in AsterixDB” article
>> specifies that the former format is a good fit when the pre-processing of a
>> record requires the result of a query, and I can’t find any documentation
>> doing this with a Java UDF.
>>
>> If the UDF must be written in SQL++ in order to accomplish this, I am
>> thinking something like this:
>>
>> create function GetUserQueryKeywords(userId) {
>> (select q.keywords from UserQueries q
>>where q.userid = userid
>>and q.timestamp > current_datetime() - daytime_duration(“PT10”))
>> };
>>
>> Could you maybe point me in the right direction of how to use such query
>> results as input for a UDF like  WordInList, if possible?
>>
>> Thanks in advance.
>>
>> Best regards,
>> Sandra
>>
>>


Re: Access data from dataset in UDF

2019-02-28 Thread Xikui Wang
Hi Sandra,

To answer your question in short: Yes, you can use Java UDF to do that.

One thing worth noticing is that, whether you use a Java UDF or a SQL++
UDF, there can be issues in some cases, as you are accessing a dataset on a
feed pipeline while that dataset is being actively fed by another data
feed. I recently submitted a paper that discusses a similar problem. There
are some examples of using SQL++ UDFs or Java UDFs on a feed pipeline in
the paper as well. I've attached the latest draft of that paper, and it's
on arXiv as well [1] (the latest draft is being processed). Please have a
look and let me know whether that helps.

[1] https://arxiv.org/abs/1902.08271

Best,
Xikui

On Thu, Feb 28, 2019 at 12:54 AM sandraskarsh...@gmail.com <
sandraskarsh...@gmail.com> wrote:

> Hi!
>
> I am trying to understand how to access data stored in a dataset, say the
> dataset "UserQueries", from a UDF. Say the intent of the given UDF is
> similar to the "WordsInList" UDF created here:
> https://github.com/idleft/asterix-udf-template/blob/master/src/main/java/org/apache/asterix/external/library/WordInListFunction.java
>
> The possible pipeline of the system would look like this:
> A socket feed is created and started, which listens to incoming data of
> the type "UserQuery". I’ve created a user interface which will send data to
> the specific socket in ADM format. This data is stored in the dataset
> "UserQueries". Then, I wish to access the data in a given record within
> "UserQueries" to find the keywords to use in the WordInList UDF. This
> function/UDF is then going to be used as a query predicate to filter the
> incoming data.
>
> Must the UDF be written in SQL++ format in order to achieve this, or is it
> possible to write it in Java? The “Data Ingestion in AsterixDB” article
> specifies that the former format is a good fit when the pre-processing of a
> record requires the result of a query, and I can’t find any documentation
> doing this with a Java UDF.
>
> If the UDF must be written in SQL++ in order to accomplish this, I am
> thinking something like this:
>
> create function GetUserQueryKeywords(userId) {
> (select q.keywords from UserQueries q
>where q.userid = userid
>and q.timestamp > current_datetime() - daytime_duration(“PT10”))
> };
>
> Could you maybe point me in the right direction of how to use such query
> results as input for a UDF like  WordInList, if possible?
>
> Thanks in advance.
>
> Best regards,
> Sandra
>
>


Re: [VOTE] Release Apache AsterixDB 0.9.4.1 and Hyracks 0.3.4.1 (RC2)

2019-02-21 Thread Xikui Wang
[X] +1 release these packages as Apache AsterixDB 0.9.4.1 and
Apache Hyracks 0.3.4.1

- Verified the sha256
- Tested Twitter feed with drop-in dependencies

On Sat, Feb 16, 2019 at 12:23 PM Mike Carey  wrote:

> [X] +1 release these packages as Apache AsterixDB 0.9.4.1 and Apache
> Hyracks 0.3.4.1
>
> (I downloaded and verified the NCService puzzle piece and it worked like a
> charm.)
>
> On 2/15/19 12:03 PM, Ian Maxon wrote:
> > Hi everyone,
> >
> > Please verify and vote on the latest release of Apache AsterixDB
> >
> > The change that produced this release and the change to advance the
> version
> > are
> > up for review on Gerrit:
> >
> >
> https://asterix-gerrit.ics.uci.edu/#/q/status:open+owner:%22Jenkins+%253Cjenkins%2540fulliautomatix.ics.uci.edu%253E%22
> >
> > The release artifacts are as follows:
> >
> > AsterixDB Source
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4.1-source-release.zip
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4.1-source-release.zip.asc
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4.1-source-release.zip.sha256
> >
> > SHA1:8bdb79294f20ff0140ea46b4a6acf5b787ac1ff3423ec41d5c5c8cdec275000c
> >
> > Hyracks Source
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.4.1-source-release.zip
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.4.1-source-release.zip.asc
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.4.1-source-release.zip.sha256
> >
> > SHA1:163a879031a270b0a1d5202247d478c7788ac0a5c704c7fb87d515337df54610
> >
> > AsterixDB NCService Installer:
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.4.1-binary-assembly.zip
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.4.1-binary-assembly.zip.asc
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.4.1-binary-assembly.zip.sha256
> >
> > SHA1:a3961f32aed8283af3cd7b66309770a5cabff426020c9c4a5b699273ad1fa820
> >
> > The KEYS file containing the PGP keys used to sign the release can be
> > found at
> >
> > https://dist.apache.org/repos/dist/release/asterixdb/KEYS
> >
> > RAT was executed as part of Maven via the RAT maven plugin, but
> > excludes files that are:
> >
> > - data for tests
> > - procedurally generated,
> > - or source files which come without a header mentioning their license,
> >but have an explicit reference in the LICENSE file.
> >
> >
> > The vote is open for 72 hours, or until the necessary number of votes
> > (3 +1) has been reached.
> >
> > Please vote
> > [ ] +1 release these packages as Apache AsterixDB 0.9.4.1 and
> > Apache Hyracks 0.3.4.1
> > [ ] 0 No strong feeling either way
> > [ ] -1 do not release one or both packages because ...
> >
> > Thanks!
> >
>


Re: Follow Up after the Research Meeting

2019-02-08 Thread Xikui Wang
Hi Shiyu,

A good way to start contributing to AsterixDB is to get a copy of
the codebase and become familiar with it. Here are instructions you can
follow to set up your development environment [1]. If you are looking
for things to kickstart with, check out our JIRA [2][3]. It has the bug
reports and improvement requests submitted by our users. I suggest you
start with improvements or minor bugs first; this can quickly guide you
through the codebase. I hope it helps you find the things that
interest you along the way as well.

[1] http://asterixdb.apache.org/dev-setup.html
[2] Improvement requests:
https://issues.apache.org/jira/browse/ASTERIXDB-2518?jql=project%20%3D%20ASTERIXDB%20AND%20resolution%20%3D%20Unresolved%20AND%20issuetype%20%3D%20improvement%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC

[3] Minor bug reports:
https://issues.apache.org/jira/browse/ASTERIXDB-2495?jql=project%20%3D%20ASTERIXDB%20AND%20issuetype%20%3D%20Bug%20%20AND%20resolution%20%3D%20Unresolved%20and%20priority%20%3D%20Minor%20%20order%20BY%20updated%20DESC

Best,
Xikui

On Thu, Feb 7, 2019 at 11:29 PM Shiyu Qiu  wrote:

> Hi,
> I am Shiyu Qiu, and  I attended the ASTERIX project research meeting at
> UCI last Friday. I am a third-year Computer Science undergraduate student.
> I am interested in participating AsterixDB program. If any of you need an
> assistant for your work, please email me. Thank you so much!
>
> Best,
> Shiyu
>


Re: error message during load phase

2019-01-06 Thread Xikui Wang
Have you tried to rebuild the project with a clean start? I tried it with
the latest master. It works for me. Try to do "mvn clean install
-DskipTests" in the asterixdb folder.

Best,
Xikui

On Sun, Jan 6, 2019 at 7:05 PM Christina Pavlopoulou 
wrote:

> Hello everyone,
>
> I had recently changed to the latest master branch and I tried to load a
> sample dataset. However, I get a classNotFoundException for
> TwitterRecordReaderFactory. I, also, tried to go back to the version I knew
> it was working and now I get the same error message. Here is the complete
> error message I get:
>
> org.apache.hyracks.algebricks.common.exceptions.AlgebricksException: Unable
> to create adapter
> at
>
> org.apache.asterix.metadata.declared.MetadataProvider.getConfiguredAdapterFactory(MetadataProvider.java:797)
> ~[classes/:?]
> at
>
> org.apache.asterix.metadata.declared.LoadableDataSource.buildDatasourceScanRuntime(LoadableDataSource.java:132)
> ~[classes/:?]
> at
>
> org.apache.asterix.metadata.declared.MetadataProvider.getScannerRuntime(MetadataProvider.java:409)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.algebra.operators.physical.DataSourceScanPOperator.contributeRuntimeOperator(DataSourceScanPOperator.java:113)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.algebra.operators.logical.AbstractLogicalOperator.contributeRuntimeOperator(AbstractLogicalOperator.java:175)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:110)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compileOpRef(PlanCompiler.java:97)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compilePlanImpl(PlanCompiler.java:70)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.core.jobgen.impl.PlanCompiler.compilePlan(PlanCompiler.java:53)
> ~[classes/:?]
> at
>
> org.apache.hyracks.algebricks.compiler.api.HeuristicCompilerFactoryBuilder$1$1.createJob(HeuristicCompilerFactoryBuilder.java:110)
> ~[classes/:?]
> at
>
> org.apache.asterix.api.common.APIFramework.compileQuery(APIFramework.java:293)
> ~[classes/:?]
> at
>
> org.apache.asterix.app.translator.QueryTranslator.handleLoadStatement(QueryTranslator.java:1778)
> ~[classes/:?]
> at
>
> org.apache.asterix.app.translator.QueryTranslator.compileAndExecute(QueryTranslator.java:338)
> ~[classes/:?]
> at org.apache.asterix.api.http.server.ApiServlet.post(ApiServlet.java:168)
> [classes/:?]
> at
>
> org.apache.hyracks.http.server.AbstractServlet.handle(AbstractServlet.java:92)
> [classes/:?]
> at
>
> org.apache.hyracks.http.server.HttpRequestHandler.handle(HttpRequestHandler.java:71)
> [classes/:?]
> at
>
> org.apache.hyracks.http.server.HttpRequestHandler.call(HttpRequestHandler.java:56)
> [classes/:?]
> at
>
> org.apache.hyracks.http.server.HttpRequestHandler.call(HttpRequestHandler.java:1)
> [classes/:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_73]
> at
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [?:1.8.0_73]
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 

Re: Filter incoming data by query predicates

2018-12-06 Thread Xikui Wang
Hi Sandra,

Yes, it will store the entire record. Note that applying a function to a
feed is different from adding a filter to a feed. To help you understand
the difference better, here is an example.

Imagine the data feed as a big dataset called FeedDataset, and you want to
store the ingested data into the TargetDataset. An equivalent statement
that moves data from the feed to the target dataset looks like this:

insert into TargetDataset(select value f from FeedDataset f);

If you apply a function called "testlib#process_func" on to the feed, the
equivalent statement is like this:

insert into TargetDataset(select value testlib#process_func(f) from
FeedDataset f);

If you have a filter function called "testlib#filter_func", and you add it
to the feed using the WHERE clause, the equivalent statement becomes this:

insert into TargetDataset(select value testlib#process_func(f) from
FeedDataset f where testlib#filter_func(f) = TRUE);

Thus, the filter function and the applied function are two separate things,
and they are orthogonal. In the last example, some incoming data is filtered
out by the function (filter_func) in the WHERE clause, and the remaining
incoming data will still be processed by the applied function (process_func).
You can use either one to fit your needs. :)
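In Java terms, a filter function is just a UDF whose result happens to be boolean: TRUE keeps the record on the pipeline, FALSE drops it. A small stand-alone sketch of the kind of predicate a filter UDF such as filter_func might compute — the keyword set and class name here are illustrative, and this is plain Java rather than the actual AsterixDB UDF interface:

```java
import java.util.Set;

public class WordFilterSketch {
    // Hypothetical keyword list; a real filter UDF would read this
    // from a resource file or a UDF parameter.
    static final Set<String> KEYWORDS = Set.of("flood", "earthquake");

    // Returns true when the record should be kept on the feed pipeline.
    static boolean keep(String text) {
        for (String token : text.toLowerCase().split("\\s+")) {
            if (KEYWORDS.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(keep("Earthquake reported downtown")); // true
        System.out.println(keep("Nice weather today"));           // false
    }
}
```

The record itself is untouched; only the boolean outcome decides whether the unchanged record continues on to the applied function and the target dataset.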

Best,
Xikui

On Thu, Dec 6, 2018 at 8:21 AM sandraskarsh...@gmail.com <
sandraskarsh...@gmail.com> wrote:

> Hi again!
>
> I am currently trying to make use of the filtering by query predicate
> example which was discussed in another thread here ("Build UDF project"),
> see below:
>
> *connect feed UserFeed to dataset EmpDataset WHERE
> testlib#wordDetector(fname) = TRUE;*
> start feed UserFeed;
>
> . using the wordDetector UDF found here:
> https://github.com/idleft/asterix-udf-template/blob/master/src/main/java/org/apache/asterix/external/library/WordInListFunction.java
>
> However, the output type of this UDF, as defined in library_descriptor.xml
> is "ABOOLEAN". Will it still store the entire record (InputRecordType) in
> the EmpDataset, or only the boolean value? And, if I would like to use the
> records which pass the filtering in wordDetector as input to another UDF,
> would I need to change the output type of the UDF? If so, the check
> "testlib#wordDetector(fname) = TRUE;*" will not work anymore, due to the
> output being an entire record instead of only a boolean.
>
> I appreciate your help!
>
> Best regards,
> Sandra
>
>
>


Re: Build UDF project (Maven) with large model to deploy to AsterixDB [Error assembling jar]

2018-11-28 Thread Xikui Wang
Yes. You can delete the lib folder since these dependencies are picked up
by AsterixDB from repo/. Having dependency jars in one of the two places
should be sufficient. :)

Best,
Xikui

On Wed, Nov 28, 2018 at 10:16 AM sandraskarsh...@gmail.com <
sandraskarsh...@gmail.com> wrote:

> Hi Xikui,
>
> So when deploying my UDF to AsterixDB, I've put the content of the
> unzipped testlib folder into this folder:
> apache-asterixdb-0.9.5-SNAPSHOT/lib/udfs/feeds/testlib/
>
> The resulting testlib content then looks like this:
> - library_descriptor.xml
> - asterix-udf-template-0.1-SNAPSHOT.jar
> - lib (folder with external dependencies)
>
> However, since the dependencies from this /lib folder ought to be copied
> into apache-asterixdb-0.9.5-SNAPSHOT/repo instead, should I delete the
> apache-asterixdb-0.9.5-SNAPSHOT/lib/udfs/feeds/testlib/lib folder which are
> created when dropping the unzipped UDF package inside testlib, or keep the
> dependencies both there and in /repo?
>
> Thanks!
>
>
> On 2018/11/28 17:03:31, Xikui Wang  wrote:
> > Hi Sandra,
> >
> > If you are following the binary-assembly-libzip.xml that you showed to me
> > earlier, the specified dependency jars should be under the lib directory
> in
> > your compiled UDF package, i.e., "- lib (dictionary containing .jars for
> my
> > dependencies listed above in binary-assembly-libzip.xml)". You can copy
> all
> > the jar files in this directory to the repo directory in AsterixDB. That
> > would work. As for the repacking part, that was for those who want to
> > distribute their patched AsterixDB to their users. In your case, you can
> > ignore that.
> >
> > Best,
> > Xikui
> >
> > On Wed, Nov 28, 2018 at 1:51 AM sandraskarsh...@gmail.com <
> > sandraskarsh...@gmail.com> wrote:
> >
> > > Hi, thanks again Xikui!
> > >
> > > I am trying the latter option now – dropping the dependency jars into
> the
> > > /repo folder. Does it have anything to say where I copy the dependency
> jars
> > > from?
> > >
> > > In addition, I think I should provide some context of my locally run
> > > instance of AsterixDB:
> > > - I have cloned the asterixdb repo from github, so I have it local on
> my
> > > Macbook Pro.
> > > - Inside the cloned folder,
> > >
> asterixdb/asterixdb/asterix-server/target/asterix-server-0.9.5-SNAPSHOT-binary-assembly
> > > folder, there lies a folder called apache-asterixdb-0.9.5-SNAPSHOT,
> which
> > > in turn contains the folders bin, etc, lib, opt and repo.
> > > - It is inside _this_ repo folder I am putting the dependency jars.
> > > - It is from this /opt/local/bin folder I am running sh
> > > start-sample-cluster.sh
> > >
> > > So, when following the option 2 example provided in your link [1], it
> says
> > > to repach this folder into a zip again. I don't quite get this, as
> this is
> > > the folder I am using to run AsterixDB?
> > >
> > > Thanks in advance!
> > >
> > > Best regards,
> > > Sandra
> > >
> > >
> > > On 2018/11/27 16:38:23, Xikui Wang  wrote:
> > > > The configuration seems alright, but it's very hard to say where the
> > > > problem is since I haven't had the chance to see what is exactly in
> your
> > > > lib directory. If this packaging doesn't work for you, you can try to
> > > pack
> > > > the dependencies into the UDF jar as a single fat jar, or you can
> drop
> > > the
> > > > dependency jars into the "asterix-server-0.9.*-binary-assembly/repo"
> > > directory,
> > > > so they can be distributed with the AsterixDB instance. I would
> recommend
> > > > the latter method, as you don't have to redeploy the dependency jars
> > > every
> > > > time when a UDF changes. These two methods are described in the
> > > > documentation of the UDF template repo [1]. :)
> > > >
> > > > [1] https://github.com/idleft/asterix-udf-template
> > > >
> > > > Best,
> > > > Xikui
> > > >
> > > > On Tue, Nov 27, 2018 at 6:04 AM sandraskarsh...@gmail.com <
> > > > sandraskarsh...@gmail.com> wrote:
> > > >
> > > > > Thank you for making sense of the log file for me, I managed to
> get the
> > > > > parameters work!
> > > > >
> > > > > However, a new challenge became evident, of course. The new error

Re: Build UDF project (Maven) with large model to deploy to AsterixDB [Error assembling jar]

2018-11-28 Thread Xikui Wang
Hi Sandra,

If you are following the binary-assembly-libzip.xml that you showed to me
earlier, the specified dependency jars should be under the lib directory in
your compiled UDF package, i.e., "- lib (directory containing .jars for my
dependencies listed above in binary-assembly-libzip.xml)". You can copy all
the jar files in this directory to the repo directory in AsterixDB. That
would work. As for the repacking part, that was for those who want to
distribute their patched AsterixDB to their users. In your case, you can
ignore that.

Best,
Xikui

On Wed, Nov 28, 2018 at 1:51 AM sandraskarsh...@gmail.com <
sandraskarsh...@gmail.com> wrote:

> Hi, thanks again Xikui!
>
> I am trying the latter option now – dropping the dependency jars into the
> /repo folder. Does it matter where I copy the dependency jars from?
>
> In addition, I think I should provide some context of my locally run
> instance of AsterixDB:
> - I have cloned the asterixdb repo from github, so I have it local on my
> Macbook Pro.
> - Inside the cloned folder,
> asterixdb/asterixdb/asterix-server/target/asterix-server-0.9.5-SNAPSHOT-binary-assembly
> folder, there lies a folder called apache-asterixdb-0.9.5-SNAPSHOT, which
> in turn contains the folders bin, etc, lib, opt and repo.
> - It is inside _this_ repo folder I am putting the dependency jars.
> - It is from this /opt/local/bin folder I am running sh
> start-sample-cluster.sh
>
> So, when following the option 2 example provided in your link [1], it says
> to repack this folder into a zip again. I don't quite get this, as this is
> the folder I am using to run AsterixDB?
>
> Thanks in advance!
>
> Best regards,
> Sandra
>
>
> On 2018/11/27 16:38:23, Xikui Wang  wrote:
> > The configuration seems alright, but it's very hard to say where the
> > problem is since I haven't had the chance to see what is exactly in your
> > lib directory. If this packaging doesn't work for you, you can try to
> pack
> > the dependencies into the UDF jar as a single fat jar, or you can drop
> the
> > dependency jars into the "asterix-server-0.9.*-binary-assembly/repo"
> directory,
> > so they can be distributed with the AsterixDB instance. I would recommend
> > the latter method, as you don't have to redeploy the dependency jars
> every
> > time when a UDF changes. These two methods are described in the
> > documentation of the UDF template repo [1]. :)
> >
> > [1] https://github.com/idleft/asterix-udf-template
> >
> > Best,
> > Xikui
> >
> > On Tue, Nov 27, 2018 at 6:04 AM sandraskarsh...@gmail.com <
> > sandraskarsh...@gmail.com> wrote:
> >
> > > Thank you for making sense of the log file for me, I managed to get the
> > > parameters work!
> > >
> > > However, a new challenge became evident, of course. The new error that
> I
> > > am seeing (java.lang.ClassNotFoundException in the cc.log when trying
> to
> > > use one of the dependencies in my code). I think this may be happening
> due
> > > to the external dependency, and if it is reachable or not from my UDF
> when
> > > running locally on AsterixDB. Could you explain if my approach for
> > > including external dependencies are right or not (approach/steps listed
> > > below)?
> > >
> > > 1. The binary-assembly-libzip.xml looks like this, where the
> dependencies
> > > are included at the bottom:
> > >
> > > 
> > >   testlib
> > >   
> > > zip
> > >   
> > >   false
> > >   
> > > 
> > >   target
> > >   
> > >   
> > > *.jar
> > >   
> > > 
> > > 
> > >   src/main/resources
> > >   
> > >   
> > > library_descriptor.xml
> > >   
> > > 
> > >   
> > >   
> > > 
> > >   
> > > commons-io:commons-io
> > > ch.qos.logback:logback-core
> > > org.slf4j:slf4j-api
> > > ch.qos.logback:logback-classic
> > > org.deeplearning4j:deeplearning4j-core
> > >
>  org.deeplearning4j:deeplearning4j-modelimport
> > > org.deeplearning4j:deeplearning4j-nlp
> > > org.nd4j:nd4j-api
> > > org.nd4j:nd4j-native
> > >   
> > >   false
> > >   lib
> > > 
> > >   
> > > 
> > >
> > > 2. When the Maven project is built (mvn clean install), it generates
> files

Re: Build UDF project (Maven) with large model to deploy to AsterixDB [Error assembling jar]

2018-11-27 Thread Xikui Wang
The configuration seems alright, but it's very hard to say where the
problem is since I haven't had the chance to see what exactly is in your
lib directory. If this packaging doesn't work for you, you can try to pack
the dependencies into the UDF jar as a single fat jar, or you can drop the
dependency jars into the "asterix-server-0.9.*-binary-assembly/repo" directory,
so they can be distributed with the AsterixDB instance. I would recommend
the latter method, as you don't have to redeploy the dependency jars every
time when a UDF changes. These two methods are described in the
documentation of the UDF template repo [1]. :)

[1] https://github.com/idleft/asterix-udf-template

Best,
Xikui

On Tue, Nov 27, 2018 at 6:04 AM sandraskarsh...@gmail.com <
sandraskarsh...@gmail.com> wrote:

> Thank you for making sense of the log file for me, I managed to get the
> parameters to work!
>
> However, a new challenge became evident, of course. The new error that I
> am seeing is a java.lang.ClassNotFoundException in the cc.log when trying
> to use one of the dependencies in my code. I think this may be happening
> due to the external dependency, and whether or not it is reachable from my
> UDF when running locally on AsterixDB. Could you explain whether my
> approach for including external dependencies is right or not
> (approach/steps listed below)?
>
> 1. The binary-assembly-libzip.xml looks like this, where the dependencies
> are included at the bottom:
>
> 
>   testlib
>   
> zip
>   
>   false
>   
> 
>   target
>   
>   
> *.jar
>   
> 
> 
>   src/main/resources
>   
>   
> library_descriptor.xml
>   
> 
>   
>   
> 
>   
> commons-io:commons-io
> ch.qos.logback:logback-core
> org.slf4j:slf4j-api
> ch.qos.logback:logback-classic
> org.deeplearning4j:deeplearning4j-core
> org.deeplearning4j:deeplearning4j-modelimport
> org.deeplearning4j:deeplearning4j-nlp
> org.nd4j:nd4j-api
> org.nd4j:nd4j-native
>   
>   false
>   lib
> 
>   
> 
>
> 2. When the Maven project is built (mvn clean install), it generates files
> in /target:
> - asterix-udf-template-0.1-SNAPSHOT-testlib.zip
> - asterix-udf-template-0.1-SNAPSHOT.jar
> - archive-tmp
> - classes
> - generated-sources
> - maven-archiver
> - maven-status
>
> 3. When unzipping the uppermost file (testlib), it contains:
> - lib (directory containing .jars for my dependencies listed above in
> binary-assembly-libzip.xml)
> - library_descriptor.xml
> - asterix-udf-template-0.1-SNAPSHOT.jar
>
> 4. And when unzipping the bottommost .jar inside the testlib here, it
> contains:
> - my model (model.bin.gz)
> - library_descriptor.xml
> - META-INF
> - org.apache.asterix.external
> > contains my classes
>
> Does this look right?
>
> I appreciate your help!
>
> Best regards,
> Sandra
>
> On 2018/11/27 06:38:58, Xikui Wang  wrote:
> > Hi Sandra,
> >
> > Based on the log, it seems you have an IndexOutOfBoundsException in your
> > UDF code. Can you double check your UDF at
> >
> org.apache.asterix.external.library.RelevanceDetecterFunction.initialize(RelevanceDetecterFunction.java:33)
> > and your UDF configuration file? You will have to make sure the
> parameters
> > are specified properly in the config file, and they are properly accessed
> > in the initialize method.
> >
> > Best,
> > Xikui
> >
> > On Mon, Nov 26, 2018 at 1:33 PM sandraskarsh...@gmail.com <
> > sandraskarsh...@gmail.com> wrote:
> >
> > > Hi Xikui!
> > >
> > > So I tried to add the resource as a parameter. However, I get this
> error
> > > (gist with log from cc.log) [1] when the query below is executed:
> > >
> > > USE feeds;
> > > CONNECT FEED TestSocketFeed TO DATASET RelevantDataset
> > > APPLY function testlib#detectRelevance; start feed TestSocketFeed
> > >
> > > To provide some context, this query works as it should when I don't
> > > include the model.
> > >
> > > [1]
> https://gist.github.com/sandraskars/3f707d9b07e5b6c1006368a297b6eacb
> > >
> > > Best regards,
> > > Sandra
> > >
> > >
> > >
> > > On 2018/11/26 05:45:03, Xikui Wang  wrote:
> > > > Hi Sandra,
> > > >
> > > > Here is an example for adding parameters to a UDF [1]. As you can
> see,
> > > the
> > > > function "KeywordsDetectorFactory" r

Re: Build UDF project (Maven) with large model to deploy to AsterixDB [Error assembling jar]

2018-11-26 Thread Xikui Wang
Hi Sandra,

Based on the log, it seems you have an IndexOutOfBoundsException in your
UDF code. Can you double check your UDF at
org.apache.asterix.external.library.RelevanceDetecterFunction.initialize(RelevanceDetecterFunction.java:33)
and your UDF configuration file? You will have to make sure the parameters
are specified properly in the config file, and they are properly accessed
in the initialize method.
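The stack trace points at an access inside initialize() on a parameter list that is shorter than expected. A defensive lookup, sketched here as plain Java with a hypothetical parameter map (the real values come from library_descriptor.xml, and requireParameter is not an AsterixDB API), fails with an actionable message instead of an opaque IndexOutOfBoundsException:

```java
import java.util.Map;

public class ParameterCheckSketch {
    // Hypothetical stand-in for the parameters parsed from library_descriptor.xml.
    static String requireParameter(Map<String, String> params, String key) {
        String value = params.get(key);
        if (value == null) {
            throw new IllegalStateException("Missing UDF parameter '" + key
                    + "' -- check library_descriptor.xml");
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> params = Map.of("modelPath", "/models/model.bin.gz");
        // Succeeds here; asking for a missing key would fail loudly at feed
        // start-up instead of surfacing deep inside initialize().
        System.out.println(requireParameter(params, "modelPath"));
    }
}
```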

Best,
Xikui

On Mon, Nov 26, 2018 at 1:33 PM sandraskarsh...@gmail.com <
sandraskarsh...@gmail.com> wrote:

> Hi Xikui!
>
> So I tried to add the resource as a parameter. However, I get this error
> (gist with log from cc.log) [1] when the query below is executed:
>
> USE feeds;
> CONNECT FEED TestSocketFeed TO DATASET RelevantDataset
> APPLY function testlib#detectRelevance; start feed TestSocketFeed
>
> To provide some context, this query works as it should when I don't
> include the model.
>
> [1] https://gist.github.com/sandraskars/3f707d9b07e5b6c1006368a297b6eacb
>
> Best regards,
> Sandra
>
>
>
> On 2018/11/26 05:45:03, Xikui Wang  wrote:
> > Hi Sandra,
> >
> > Here is an example for adding parameters to a UDF [1]. As you can see,
> the
> > function "KeywordsDetectorFactory" reads a given list path from a UDF
> > parameter. You can use this to reuse a Java function with different
> > resource files. This function is contained in the AsterixDB release as
> > well. Please make sure the path to the resource file is correct when you
> > use it. That's a tricky part that I always make mistakes.
> >
> > The initialize(), i.e. the model loading, is executed when the "start
> feed"
> > statement is executed. This doesn't require Tweets to come. Is that the
> > case you are referring to?
> >
> > As for your use case, here is an interesting thing that you can try.
> There
> > is a feature in the data feeds which is currently not in our
> documentation,
> > which is to allow you to filter out incoming data by query predicates. If
> > you want to filter out Tweets with the model file that you trained, you
> can
> > attach a Java UDF on your ingestion pipeline with the following query:
> >
> > use test;
> > create type InputRecordType as closed {
> > id:int64,
> > fname:string,
> > lname:string,
> > age:int64,
> > dept:string
> > };
> > create dataset EmpDataset(InputRecordType) primary key id;
> > create feed UserFeed with {
> > "adapter-name" : "socket_adapter",
> > "sockets" : "127.0.0.1:10001",
> > "address-type" : "IP",
> > "type-name" : "InputRecordType",
> > "format" : "delimited-text",
> > "delimiter" : "|",
> > "upsert-feed" : "true"
> > };
> > *connect feed UserFeed to dataset EmpDataset WHERE
> > testlib#wordDetector(fname) = TRUE;*
> > start feed UserFeed;
> >
> > The Java UDF used here is in [2]. This can help you filter out unwanted
> > incoming data on the pipeline. :)
> >
> > [1]
> >
> https://github.com/idleft/asterix-udf-template/blob/master/src/main/resources/library_descriptor.xml
> >
> > [2]
> >
> https://github.com/idleft/asterix-udf-template/blob/master/src/main/java/org/apache/asterix/external/library/WordInListFunction.java
> >
> > Best,
> > Xikui
> >
> > On Sun, Nov 25, 2018 at 1:05 PM sandraskarsh...@gmail.com <
> > sandraskarsh...@gmail.com> wrote:
> >
> > > Hi Xikui,
> > >
> > > Thanks for your response!
> > > We managed to cope with the problem by using the compressed version of
> the
> > > model instead, but it is still 1.6 GB. However, the project is able to
> > > build now :-) Yes, this is being packed into the UDF jar at the
> moment.  Do
> > > you have any examples that illustrates how to use the resource file
> path as
> > > a UDF parameter? That would be very helpful!
> > >
> > > In addition, I believe that the model loading – which is now being
> > > executed during initialize() – restrains the incoming tweets of being
> > > processed. This is evident because none of the streaming elements are
> > > stored in AsterixDB when the model loading is included in the code,
> whilst
> > > the elements are stored when I exclude the model loading from the
> code. Is
> > > it possible to make the model load, i.e making initialize() run, prior
> the
> > > arrival of the tweets at the socketfeed?
> > >
&

Re: Build UDF project (Maven) with large model to deploy to AsterixDB [Error assembling jar]

2018-11-25 Thread Xikui Wang
Hi Sandra,

Here is an example of adding parameters to a UDF [1]. As you can see, the
function "KeywordsDetectorFactory" reads a given list path from a UDF
parameter. You can use this to reuse a Java function with different
resource files. This function is contained in the AsterixDB release as
well. Please make sure the path to the resource file is correct when you
use it. That's a tricky part where I always make mistakes myself.

The initialize(), i.e. the model loading, is executed when the "start feed"
statement is executed. This doesn't require Tweets to come. Is that the
case you are referring to?
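Because initialize() runs when the "start feed" statement executes, an expensive model load belongs behind a once-only guard so repeated initializations on the same node reuse the loaded model. A minimal sketch — loadModel and the path are placeholders for whatever deserialization (e.g. a word2vec model) the UDF actually performs:

```java
public class LazyModelLoad {
    static int loadCount = 0;          // only for demonstrating the guard
    private static Object model;

    // Placeholder for an expensive deserialization step.
    static Object loadModel(String path) {
        loadCount++;
        return "model-from:" + path;
    }

    // Loads the model at most once, no matter how often it is requested.
    static synchronized Object getModel(String path) {
        if (model == null) {
            model = loadModel(path);
        }
        return model;
    }

    public static void main(String[] args) {
        getModel("/models/model.bin.gz");
        getModel("/models/model.bin.gz");
        System.out.println(loadCount); // the expensive load ran exactly once
    }
}
```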

As for your use case, here is an interesting thing that you can try. There
is a feature in data feeds which is currently not in our documentation:
it allows you to filter out incoming data by query predicates. If
you want to filter out Tweets with the model file that you trained, you can
attach a Java UDF to your ingestion pipeline with the following query:

use test;
create type InputRecordType as closed {
    id: int64,
    fname: string,
    lname: string,
    age: int64,
    dept: string
};
create dataset EmpDataset(InputRecordType) primary key id;
create feed UserFeed with {
    "adapter-name" : "socket_adapter",
    "sockets" : "127.0.0.1:10001",
    "address-type" : "IP",
    "type-name" : "InputRecordType",
    "format" : "delimited-text",
    "delimiter" : "|",
    "upsert-feed" : "true"
};
*connect feed UserFeed to dataset EmpDataset WHERE
testlib#wordDetector(fname) = TRUE;*
start feed UserFeed;

The Java UDF used here is in [2]. This can help you filter out unwanted
incoming data on the pipeline. :)

[1]
https://github.com/idleft/asterix-udf-template/blob/master/src/main/resources/library_descriptor.xml

[2]
https://github.com/idleft/asterix-udf-template/blob/master/src/main/java/org/apache/asterix/external/library/WordInListFunction.java

Best,
Xikui

On Sun, Nov 25, 2018 at 1:05 PM sandraskarsh...@gmail.com <
sandraskarsh...@gmail.com> wrote:

> Hi Xikui,
>
> Thanks for your response!
> We managed to cope with the problem by using the compressed version of the
> model instead, but it is still 1.6 GB. However, the project is able to
> build now :-) Yes, this is being packed into the UDF jar at the moment.  Do
> you have any examples that illustrates how to use the resource file path as
> a UDF parameter? That would be very helpful!
>
> In addition, I believe that the model loading – which is now being
> executed during initialize() – prevents the incoming tweets from being
> processed. This is evident because none of the streaming elements are
> stored in AsterixDB when the model loading is included in the code, whilst
> the elements are stored when I exclude the model loading from the code. Is
> it possible to make the model load, i.e. make initialize() run, prior to
> the arrival of the tweets at the socket feed?
>
> Regarding our project, we are trying to detect tweets which are relevant
> for a given "user query", where the goal is crisis detection. So we are
> trying to filter out (i.e _not_ store or keep in the pipeline) tweets which
> do not contain the relevant location etc. The model I've talked about is
> being used for word embeddings (word2vec) :-)
>
> Best regards,
> Sandra Skarshaug
>
>
> On 2018/11/24 17:55:27, Xikui Wang  wrote:
> > Hi Sandra,
> >
> > How big is the model file that you are using? I guess you are trying to
> > pack this model file into the UDF jar? I personally haven't seen this
> error
> > before. It feels like a Maven build-with-big-files issue. I found this
> > thread on StackOverflow which describes a similar situation. Could you
> > try the resolutions there?
> >
> > As a side note, if you need to use a big model file in a UDF, I wouldn't
> > suggest packing it into your UDF jar file, because this will
> > significantly slow down your UDF installation, and you will spend a lot
> > of time redeploying the resource file to the cluster when you only need
> > to update the UDF code. Alternatively, you could make the resource file
> > path a UDF parameter, and let the UDF load that file when it
> > initializes. This makes the installation much faster, avoids deploying
> > the resource file multiple times, and the packing issue should be gone
> > as well.
> > :)
> >
> > PS If it's ok, could you tell us which use case you are working on? We
> > would like to know how our customers use AsterixDB in different
> scenarios,
> > so we can help them (you) better!
> >
> > Best,
> > Xikui
> >
> >
> >
> > On Sat, Nov 24, 2018 at 6:05 AM

Re: Feed adapter for twitter data

2018-11-14 Thread Xikui Wang
Here is an example of how you would create a socket feed with a JSON parser:

create type TweetType as {
id: int64
};

create dataset Tweets (TweetType) primary key id;

create feed TwitterFeed with {
  "adapter-name" : "socket_adapter",
  "sockets" : "127.0.0.1:10001",
  "address-type" : "IP",
  "type-name" : "TweetType",
  "format" : "json"
};

connect feed TwitterFeed to dataset Tweets;
start feed TwitterFeed;

One thing worth noticing is that the JSON format has limited data types, so
you will see that timestamps are parsed as strings and points are parsed as
arrays of doubles.

Best,
Xikui

On Wed, Nov 14, 2018 at 8:32 AM Xikui Wang  wrote:

> Hi Sandra,
>
> Yes. You can create a socket feed with the JSON parser. This will allow
> you to push JSON formatted Tweets into AsterixDB directly.
>
> Best,
> Xikui
>
> On Wed, Nov 14, 2018 at 3:11 AM Sandra Skarshaug <
> sandraskarsh...@gmail.com> wrote:
>
>> Hi!
>>
>> Is it possible to use the twitter feed adapter without providing consumer
>> key and access token, but instead connecting the adapter to a socket which
>> streams twitter data?
>>
>> Best regards,
>> Sandra Skarshaug
>>
>


Re: Feed adapter for twitter data

2018-11-14 Thread Xikui Wang
Hi Sandra,

Yes. You can create a socket feed with the JSON parser. This will allow you
to push JSON formatted Tweets into AsterixDB directly.

Best,
Xikui

On Wed, Nov 14, 2018 at 3:11 AM Sandra Skarshaug 
wrote:

> Hi!
>
> Is it possible to use the twitter feed adapter without providing consumer
> key and access token, but instead connecting the adapter to a socket which
> streams twitter data?
>
> Best regards,
> Sandra Skarshaug
>


Re: OptimizedHybridHashJoinOperatorDescriptor vs. HybridHashJoinOperatorDescriptor

2018-11-08 Thread Xikui Wang
Hi Yingyi,

Thanks for your reply. I think you are right. The two seem to serve the
same purpose, except that the unoptimized one has less documentation and no
memory management. I will propose a patch later to remove the unoptimized one.

Best,
Xikui

On Thu, Nov 8, 2018 at 9:54 PM Yingyi Bu  wrote:

> I'm not sure if it's still correct, but based on my understanding,
> OptimizedHybridHashJoinOperatorDescriptor does the role reversal
> optimization which was done by Pouria, while
> HybridHashJoinOperatorDescriptor was the old implementation before Pouria's
> work and probably could be deleted.
>
> Best,
> Yingyi
>
> On Thu, Nov 8, 2018 at 6:52 PM Xikui Wang  wrote:
>
> > Hi Devs,
> >
> > Does anyone know what's the difference between
> > the OptimizedHybridHashJoinOperatorDescriptor and the
> > HybridHashJoinOperatorDescriptor?
> >
> > I was going over part of the join codes and found both of these classes
> > there. It seems the only difference between the two is when there is a
> > hashFunctionFamily is null, we will use the
> > HybridHashJoinOperatorDescriptor. However, it seems this would not
> > happen... In fact, I changed to code to always use the optimized one and
> no
> > test cases fail (could be test cases bias though). Thanks ahead!
> >
> > Best,
> > Xikui
> >
>


OptimizedHybridHashJoinOperatorDescriptor vs. HybridHashJoinOperatorDescriptor

2018-11-08 Thread Xikui Wang
Hi Devs,

Does anyone know what's the difference between
the OptimizedHybridHashJoinOperatorDescriptor and the
HybridHashJoinOperatorDescriptor?

I was going over part of the join code and found both of these classes
there. It seems the only difference between the two is that when the
hashFunctionFamily is null, we use the HybridHashJoinOperatorDescriptor.
However, it seems this never happens... In fact, I changed the code to
always use the optimized one and no test cases fail (though that could be
test-case bias). Thanks ahead!

Best,
Xikui


Re: Install UDF library (NCService option)

2018-10-16 Thread Xikui Wang
The UDF package is installed when AsterixDB starts. Since the old UDFs were
installed properly, there might be something wrong with the new package that
you dropped in. Could you try again with a clean AsterixDB and the freshly
compiled package? You could try removing some default UDFs to make sure
the package that you have is actually up-to-date.

Best,
Xikui

On Tue, Oct 16, 2018 at 1:54 AM sandraskarsh...@gmail.com <
sandraskarsh...@gmail.com> wrote:

>
>
> On 2018/10/15 17:58:38, Xikui Wang  wrote:
> > Hi Sandra,
> >
> > The UDF template repo that you used was outdated for some time. There has
> > been some UDF refactorization in AsterixDB since that. I've just updated
> > the template repo to the latest AsterixDB build. Please pull the latest
> > changes before you try again.
> >
> > If you want to install UDF with the NCService deployment, you would have
> to
> > manually copy the package content into the
> > "lib/udfs/DATAVERSE_NAME/LIB_NAME/" directory. You will probably need to
> > create all the folders including the "udfs" one. A sample structure of
> the
> > lib directory will be like this:
> >
> > lib
> > 
> > ├── stax-utils-20060502.jar
> > └── udfs
> > └── test
> > └── testlib
> > ├── asterix-udf-template-0.1-SNAPSHOT.jar
> > └── library_descriptor.xml
> >
> > About your previous question, Ansible always relies on SSH even if you
> > configure it with localhost only. It treats the localhost as a cluster
> with
> > a single machine. You may want to check the password-less configuration
> of
> > your machine to avoid that problem. If you just to want to find a fast
> way
> > to play around, the NCService deployment would be the right way to go. :)
> >
> > Best,
> > Xikui
> >
> > On Mon, Oct 15, 2018 at 8:14 AM Ian Maxon  wrote:
> >
> > > You can probably get away with it, yes. Look at what the ansible task
> > > to install the UDF does and copy it to the same location, and it'll
> > > work.
> > >
> > > On Mon, Oct 15, 2018 at 7:00 AM Sandra Skarshaug
> > >  wrote:
> > > >
> > > > Some follow up information, I have a folder structure inside
> > > asterix-server-0.9.3-binary-assembly which looks like the attached
> image.
> > > The files inside testlib are generated when building the following
> > > udf-template project: https://github.com/idleft/asterix-udf-template.
> > > >
> > > > When I try to execute the query below, it says function
> > > test.testlib#mysum@2 is not defined [CompilationException]
> > > >
> > > >  "use test;
> > > >  testlib#mysum(3,4);"
> > > >
> > > > man. 15. okt. 2018 kl. 11:36 skrev Sandra Skarshaug <
> > > sandraskarsh...@gmail.com>:
> > > >>
> > > >> Hi!
> > > >>
> > > >> I am using AsterixDB for my master thesis, and I have some issues I
> > > hope you can help me with!
> > > >>
> > > >> How do I install a UDF package when I am using NCService, not
> Ansible?
> > > Is it enough to unzip the .jar generated from my project, and add it to
> > > asterix-server-0.9.3-binary-assembly/lib/udfs/testlib? Or should I use
> my
> > > local version of asterixdb (the cloned repo), not the one installed
> through
> > > your webpage?
> > > >>
> > > >> The following information is found in the cc.log when I put the .jar
> > > >> in the location specified above (lib/udfs/testlib):
> > > >>
> > > >> Oct 15, 2018 10:41:17 AM org.apache.hyracks.control.cc.CCDriver main
> > > >> SEVERE: Exiting CCDriver due to exception
> > > >> java.lang.ArrayIndexOutOfBoundsException: 0
> > > >> at org.apache.asterix.app.external.ExternalLibraryUtils.getLibraryClassLoader(ExternalLibraryUtils.java:353)
> > > >> at org.apache.asterix.app.external.ExternalLibraryUtils.registerLibrary(ExternalLibraryUtils.java:299)
> > > >> at org.apache.asterix.app.external.ExternalLibraryUtils.setUpExternaLibraries(ExternalLibraryUtils.java:78)
> > > >> at org.apache.asterix.hyracks.bootstrap.CCApplication.start(CCApplication.java:140)
> > > >> at org.apache.hyracks.control.cc.ClusterControllerService.startApplication(ClusterControllerService.java:226)
> > > >> at org.apache.hyracks.control.cc.ClusterControllerService.start(ClusterControllerService.java:212)
> > > >> at org.apache.hyracks.control.cc.CCDriver.main(CCDriver.java:47)
> > > >>
> > > >>
> > > >> I am really looking forward to your answer!
> > > >>
> > > >> Best regards,
> > > >> Sandra Skarshaug
> > >
> > Hi, thanks for your reply! It finally works now!
>
> However, I wrote a new function and factory (in the udf-template project)
> and updated the library_descriptor. Then I built the project, ran the
> stop script, and added the updated files to /testlib. When starting
> AsterixDB now, only the old (first) functions are found when executing the
> query "SELECT * FROM Metadata.`Function`;", not the new one. Any idea what
> I have to do to make the new function available as well?
>


Re: Install UDF library (NCService option)

2018-10-15 Thread Xikui Wang
Hi Sandra,

The UDF template repo that you used had been outdated for some time; there
has been some UDF refactoring in AsterixDB since then. I've just updated
the template repo to the latest AsterixDB build. Please pull the latest
changes before you try again.

If you want to install a UDF with the NCService deployment, you have to
manually copy the package content into the
"lib/udfs/DATAVERSE_NAME/LIB_NAME/" directory. You will probably need to
create all the folders, including the "udfs" one. A sample structure of the
lib directory looks like this:

lib

├── stax-utils-20060502.jar
└── udfs
└── test
└── testlib
├── asterix-udf-template-0.1-SNAPSHOT.jar
└── library_descriptor.xml
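The copy steps described above can be sketched as a small script. The server directory and the UDF package name below are placeholders (assumptions), not the project's actual artifact names; substitute your own.

```shell
# Sketch of the manual NCService UDF install: create the
# lib/udfs/DATAVERSE_NAME/LIB_NAME tree and unpack the UDF package into it.
# SERVER_DIR and the zip name are assumptions -- substitute your own.
SERVER_DIR="${SERVER_DIR:-./asterix-server-binary-assembly}"
DATAVERSE="test"
LIBRARY="testlib"
TARGET="$SERVER_DIR/lib/udfs/$DATAVERSE/$LIBRARY"

mkdir -p "$TARGET"    # the "udfs" folder usually has to be created as well
# unzip -o my-udf-package.zip -d "$TARGET"   # jar + library_descriptor.xml
ls -d "$TARGET"
```

Restart the instance afterwards so the library is picked up at startup.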

About your previous question: Ansible always relies on SSH, even if you
configure it with localhost only; it treats localhost as a cluster with a
single machine. You may want to check the password-less SSH configuration
of your machine to avoid that problem. If you just want a fast way to play
around, the NCService deployment is the right way to go. :)
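Since Ansible treats localhost as a one-node cluster reached over SSH, password-less login to localhost is the thing to check. The following is a standard OpenSSH sketch, not anything AsterixDB-specific; the key path is the usual default and an assumption here.

```shell
# Set up password-less SSH to localhost: generate a key with an empty
# passphrase if none exists, then authorize it for logins to this machine.
KEY="$HOME/.ssh/id_rsa"
mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
if command -v ssh-keygen >/dev/null 2>&1; then
    [ -f "$KEY" ] || ssh-keygen -q -t rsa -N "" -f "$KEY"
    cat "$KEY.pub" >> "$HOME/.ssh/authorized_keys"
    chmod 600 "$HOME/.ssh/authorized_keys"
fi
# verification (requires a running sshd):
# ssh -o BatchMode=yes localhost true && echo "password-less SSH OK"
```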

Best,
Xikui

On Mon, Oct 15, 2018 at 8:14 AM Ian Maxon  wrote:

> You can probably get away with it, yes. Look at what the ansible task
> to install the UDF does and copy it to the same location, and it'll
> work.
>
> On Mon, Oct 15, 2018 at 7:00 AM Sandra Skarshaug
>  wrote:
> >
> > Some follow up information, I have a folder structure inside
> asterix-server-0.9.3-binary-assembly which looks like the attached image.
> The files inside testlib are generated when building the following
> udf-template project: https://github.com/idleft/asterix-udf-template.
> >
> > When I try to execute the query below, it says function
> test.testlib#mysum@2 is not defined [CompilationException]
> >
> >  "use test;
> >  testlib#mysum(3,4);"
> >
> > man. 15. okt. 2018 kl. 11:36 skrev Sandra Skarshaug <
> sandraskarsh...@gmail.com>:
> >>
> >> Hi!
> >>
> >> I am using AsterixDB for my master thesis, and I have some issues I
> hope you can help me with!
> >>
> >> How do I install a UDF package when I am using NCService, not Ansible?
> Is it enough to unzip the .jar generated from my project, and add it to
> asterix-server-0.9.3-binary-assembly/lib/udfs/testlib? Or should I use my
> local version of asterixdb (the cloned repo), not the one installed through
> your webpage?
> >>
> >> The following information is found in the cc.log when I put the .jar in
> the location specified above (lib/udfs/testlib):
> >>
> >> Oct 15, 2018 10:41:17 AM org.apache.hyracks.control.cc.CCDriver main
> >> SEVERE: Exiting CCDriver due to exception
> >> java.lang.ArrayIndexOutOfBoundsException: 0
> >> at org.apache.asterix.app.external.ExternalLibraryUtils.getLibraryClassLoader(ExternalLibraryUtils.java:353)
> >> at org.apache.asterix.app.external.ExternalLibraryUtils.registerLibrary(ExternalLibraryUtils.java:299)
> >> at org.apache.asterix.app.external.ExternalLibraryUtils.setUpExternaLibraries(ExternalLibraryUtils.java:78)
> >> at org.apache.asterix.hyracks.bootstrap.CCApplication.start(CCApplication.java:140)
> >> at org.apache.hyracks.control.cc.ClusterControllerService.startApplication(ClusterControllerService.java:226)
> >> at org.apache.hyracks.control.cc.ClusterControllerService.start(ClusterControllerService.java:212)
> >> at org.apache.hyracks.control.cc.CCDriver.main(CCDriver.java:47)
> >>
> >>
> >> I am really looking forward to your answer!
> >>
> >> Best regards,
> >> Sandra Skarshaug
>


Re: [VOTE] Release AsterixDB 0.9.4 and Hyracks 0.3.4 (RC2)

2018-09-28 Thread Xikui Wang
+1

- Verified that the Twitter feed works.
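(The signature/hash checks that voters in these threads report boil down to commands like the following; sketched here with a locally created stand-in file, since the real release artifacts are not included in this archive.)

```shell
# Typical release-verification commands behind a "+1": compare the published
# SHA-256 and check the PGP signature. Demonstrated on a stand-in file.
ZIP="artifact.zip"
printf 'demo payload' > "$ZIP"
sha256sum "$ZIP" > "$ZIP.sha256"     # stands in for the published .sha256
sha256sum -c "$ZIP.sha256" && echo "hash OK"
# signature check (requires the real .asc file and the release KEYS):
# gpg --import KEYS && gpg --verify "$ZIP.asc" "$ZIP"
```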

On Mon, Sep 24, 2018 at 2:57 PM Ian Maxon  wrote:

> Sorry, my mv got a bit screwed up. They are SHA-256; I'll rename the files.
> On Sat, Sep 22, 2018 at 3:28 PM Taewoo Kim  wrote:
> >
> > +1
> >
> > [v] Verify the signatures and hashes.
> > [v] Verify that source builds correctly.
> > [v] Verify that a query works.
> >
> > Only one minor comment:
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4-source-release.zip.sha25
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4.zip.256
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb_0.9.4_all.deb.256
> >
> > The extension of these files should be .sha256.
> >
> >
> >
> >
> >
> > On Thu, Sep 20, 2018 at 4:13 PM Wail Alkowaileet 
> wrote:
> >
> > > Then it looks good for me +1
> > >
> > > On Wed, Sep 19, 2018 at 4:36 PM Ian Maxon  wrote:
> > >
> > > > I moved the file, sorry about that, didn't notice. The tests should
> be
> > > > OK; it all passes on Gerrit right now.
> > > > On Tue, Sep 18, 2018 at 3:54 PM Wail Alkowaileet  >
> > > > wrote:
> > > > >
> > > > >- mvn verify reports two issues (not sure about severity of
> them):
> > > > >
> > > > >
> > > > >1. SqlppExecutionWithCancellationTest.tearDown:54 There are 15
> > > leaked
> > > > >run files.
> > > > >2. DiskIsFullTest.testDiskIsFull:179 Expected exception
> > > > >(org.apache.hyracks.api.exceptions.HyracksDataException:
> HYR0088:
> > > > Cannot
> > > > >modify index (Disk is full)) was not thrown
> > > > >
> > > > > The latter does not seem to skip the test on macOS High Sierra, and
> > > > > it seems to have been removed from the current master.
> > > > >
> > > > >- Signatures and hashes look good.
> > > > >
> > > > > One thing: the .asc file for the AsterixDB Installer should be
> > > > > renamed from asterix-server-0.9.4.zip.asc to
> > > > apache-asterixdb-0.9.4.zip.asc
> > > > >
> > > > >
> > > > > On Fri, Sep 7, 2018 at 1:55 PM Ian Maxon  wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > Please verify and vote on the latest release of Apache AsterixDB
> > > > > >
> > > > > > The change that produced this release and the change to advance
> the
> > > > > > version are
> > > > > > up for review on Gerrit:
> > > > > >
> > > > > >
> > > > > >
> > > >
> > >
> https://asterix-gerrit.ics.uci.edu/#/q/status:open+owner:%22Jenkins+%253Cjenkins%2540fulliautomatix.ics.uci.edu%253E%22
> > > > > >
> > > > > > The release artifacts are as follows:
> > > > > >
> > > > > > AsterixDB Source
> > > > > >
> > > > > >
> > > >
> > >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4-source-release.zip
> > > > > >
> > > > > >
> > > >
> > >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4-source-release.zip.asc
> > > > > >
> > > > > >
> > > >
> > >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4-source-release.zip.sha25
> > > > > >
> > > > > > SHA256:
> > > > 2bedc3e30bdebdc26ae7fdbe4ce9b2ec8d546a195ee8bc05f7e0e516e747bfe8
> > > > > >
> > > > > > Hyracks Source
> > > > > >
> > > > > >
> > > >
> > >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.4-source-release.zip
> > > > > >
> > > > > >
> > > >
> > >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.4-source-release.zip.asc
> > > > > >
> > > > > >
> > > >
> > >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.4-source-release.zip.sha256
> > > > > >
> > > > > > SHA256:
> > > > 8d3d8c734d0e49b145619d8e083aea4cd599adb2b9fe148b05eac8550caf1764
> > > > > >
> > > > > > AsterixDB Installer:
> > > > > >
> > > >
> > >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4.zip
> > > > > >
> > > > > >
> > > >
> > >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4.zip.asc
> > > > > >
> > > > > >
> > > >
> > >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4.zip.256
> > > > > >
> > > > > > SHA256:
> > > > 0b939231635f0c2328018f7064df9a4fa4b05b36835127a12eae4543141aecd9
> > > > > >
> > > > > > AsterixDB Debian/Ubuntu Package:
> > > > > >
> > > > > >
> > > >
> > >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb_0.9.4_all.deb
> > > > > >
> > > > > >
> > > >
> > >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb_0.9.4_all.deb.asc
> > > > > >
> > > > > >
> > > >
> > >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb_0.9.4_all.deb.256
> > > > > >
> > > > > > SHA256:
> > > > c41fc765f04cb335c5fb728af625217289f64886d05d724e3ef6aa140d4437f5
> > > > > >
> > > > > > The KEYS file containing the PGP keys used to sign the release
> can be
> > > > > > found at
> > > > > >
> > > > > > https://dist.apache.org/repos/dist/release/asterixdb/KEYS
> > > > > >
> > > > > > RAT was executed as part of Maven via the RAT maven plugin, but
> > > > > > excludes files that are:
> 

Re: [VOTE] Release Apache AsterixDB 0.9.4 and Hyracks 0.3.4 (RC1)

2018-07-19 Thread Xikui Wang
+1

- SHA1 verified for NCService Installer.
- Drop-in Twitter4j jar and verified twitter feed.
- UDF installation and query work fine.

Best,
Xikui

On Thu, Jul 19, 2018 at 1:41 PM Ian Maxon  wrote:

> Here's a summary of all issues addressed in some way for this release,
> since last release:
>
> [ASTERIXDB-2397][*DB] Enable build on Java 10
> [ASTERIXDB-2397][*DB] Fix sample cluster on Java 10
> [ASTERIXDB-2397][*DB] Enable execution on Java 9/10
> [ASTERIXDB-2396][LIC] Include netty-all required NOTICEs
> [ASTERIXDB-2318] Build dashboard in mvn
> [ASTERIXDB-2387][MTD] Prevent Dataset Primary Index Drop
> [ASTERIXDB-2377][OTH] Fix JSON of Additional Expressions
> [ASTERIXDB-2354][COMP] Partition constraint propagation for binary
> operators
> [ASTERIXDB-2355][SQL] Incorrect error reporting by SQL++ parser
> [ASTERIXDB-2358][LIC] Fix asterix-replication packaging
> [ASTERIXDB-2347][DOC] Update Configurable Parameters
> [ASTERIXDB-2361][HYR] Memory Leak Due to Netty Close Listeners
> [ASTERIXDB-2353][HYR][RT][FAIL] Provide complete thread dumps
> [ASTERIXDB-2352][FUN] Incorrect leap year handling in duration arithmetic
> [ASTERIXDB-2351][COMP] Allow '+' after exponent indicator in double
> literals
> [ASTERIXDB-2343][FUN] Implement to_array(), to_atomic(), to_object()
> [ASTERIXDB-2348][COMP] Incorrect result with distinct aggregate
> [ASTERIXDB-2346][COMP] Constant folding should not fail on runtime
> exceptions
> [ASTERIXDB-2345][FUN] Fix runtime output type for object_names()
> [ASTERIXDB-2340][FUN] Implement object_length(), object_names()
> [ASTERIXDB-2334] Fix Range Predicate for Composite Key Search
> [ASTERIXDB-2216] Disable flaky test which depends on external site
> [ASTERIXDB-1708][TX] Prevent log deletion during scan
> [ASTERIXDB-2321][STO] Follow the contract in IIndexCursor.open calls
> [ASTERIXDB-2332][RT] Fix concurrency issue with RecordMerge and
> RecordRemoveFields
> [ASTERIXDB-2308][STO] Prevent Race To Allocate Memory Components
> [ASTERIXDB-2319][TEST] Split Queries in start-feed Test
> [ASTERIXDB-2285][TEST] Increase Poll Time on Test
> [ASTERIXDB-2317] Intermittent Failure in Kill CC NCServiceExecutionIT
> [ASTERIXDB-2329][MTD] Remove Invalid Find Dataset
> [ASTERIXDB-2330][*DB][RT] Add IFunctionRegistrant for dynamic function
> registration
> [ASTERIXDB-2320][CLUS] Don't delay removing dead node on max heartbeat
> misses
> [ASTERIXDB-2316][STO] Fix Merging Components For Full Merge
> [ASTERIXDB-2213] Guard against concurrent config updates
> [ASTERIXDB-1952][TX][IDX] Filter logs pt.2
> [ASTERIXDB-1280][TEST] JUnit cleanup
> [ASTERIXDB-2305][FUN] replace() should not accept regular expressions
> [ASTERIXDB-2313][EXT] JSONDataParser support for non-object roots
> [ASTERIXDB-2229][OTR] Restore Thread Names in Thread Pool
> [ASTERIXDB-2304] Ensure Flush is Finished in FlushRecoveryTest
> [ASTERIXDB-2307][COMP] Incorrect result with quantified expression
> [ASTERIXDB-2303][API] Fix Supplementary Chars Printing
> [ASTERIXDB-2148][FUN] Add init parameter for external UDF
> [ASTERIXDB-2301][TX] Fix Abort of DELETE operation
> [ASTERIXDB-2074][MVN] Fix manifest metadata
> [ASTERIXDB-2302][COMP] Incorrect result with non-enforced index
> [ASTERIXDB-2299] Set log type properly during modifications
> [ASTERIXDB-2188] Ensure recovery of component ids
> [ASTERIXDB-2227][ING] Enabling filitering incoming data in feed
> [ASTERIXDB-2296][COMP] proper handling of an optional subfield type
> [ASTERIXDB-2291][FUN] Implement if_inf,if_nan,if_nan_or_inf
> [ASTERIXDB-2294][FUN] Implement is_atomic()
> [ASTERIXDB-2287][SQL] Support SELECT variable.* in SQL++
> [ASTERIXDB-2083][COMP][RT][IDX][SITE] Budget-Constrained Inverted index
> search
> [ASTERIXDB-2282][HTTP] Revive HTTP server on unexpected channel drops
> [ASTERIXDB-1972][COMP][RT][TX] index-only plan
> [ASTERIXDB-2284][CLUS] Ensure Node Failure on Heartbeat Miss
> [ASTERIXDB-2271][RT] Remove Result Ref of Aborted Jobs
> [ASTERIXDB-2204][STO] Fix implementations and usages of IIndexCursor
>
> On Tue, Jul 17, 2018 at 7:10 PM, Ian Maxon  wrote:
> > Hi everyone,
> >
> > Please verify and vote on the latest release of Apache AsterixDB
> >
> > The change that produced this release and the change to advance the
> version are
> > up for review on Gerrit:
> >
> >
> https://asterix-gerrit.ics.uci.edu/#/q/status:open+owner:%22Jenkins+%253Cjenkins%2540fulliautomatix.ics.uci.edu%253E%22
> >
> > The release artifacts are as follows:
> >
> > AsterixDB Source
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4-source-release.zip
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4-source-release.zip.asc
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4-source-release.zip.sha1
> >
> > SHA1:d991d2d825197f73cad015870a5eb014291e03ad
> >
> > Hyracks Source
> >
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.4-source-release.zip
> >
> 

Re: [VOTE] Release Apache AsterixDB 0.9.4 and Hyracks 0.3.4 (RC0)

2018-07-17 Thread Xikui Wang
Yes. The UDF is compiled on my machine that has Java 8 installed, and the
cluster has Java 8 as well.

Best,
Xikui

On Tue, Jul 17, 2018 at 10:16 AM Ian Maxon  wrote:

> Not sure, but it looks like an issue with using java 10 when compiled
> with java 8 or vice-versa. You're certain the UDF is compiled with 8?
>
> On Tue, Jul 17, 2018 at 12:00 AM, Xikui Wang  wrote:
> > I notice that the latest master has a problem with running UDF on a
> > cluster. When a UDF is deployed to the cluster, AsterixDB would fail to
> > start due to the following exception:
> >
> >
> > Exception in thread "main" java.lang.NoClassDefFoundError:
> > com/sun/xml/bind/v2/model/annotation/AnnotationReader
> >
> >   at java.lang.ClassLoader.defineClass1(Native Method)
> >
> >   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
> >
> >   at
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> >
> >   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
> >
> >   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
> >
> >   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
> >
> >   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
> >
> >   at java.security.AccessController.doPrivileged(Native Method)
> >
> >   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
> >
> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> >
> >   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
> >
> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> >
> >   at java.lang.Class.getDeclaredMethods0(Native Method)
> >
> >   at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> >
> >   at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> >
> >   at java.lang.Class.getMethod0(Class.java:3018)
> >
> >   at java.lang.Class.getMethod(Class.java:1784)
> >
> >   at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:242)
> >
> >   at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:234)
> >
> >   at javax.xml.bind.ContextFinder.find(ContextFinder.java:441)
> >
> >   at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:641)
> >
> >   at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:584)
> >
> >   at org.apache.asterix.app.external.ExternalLibraryUtils.getLibrary(ExternalLibraryUtils.java:325)
> >
> >   at org.apache.asterix.app.external.ExternalLibraryUtils.configureLibrary(ExternalLibraryUtils.java:288)
> >
> >   at org.apache.asterix.app.external.ExternalLibraryUtils.setUpExternaLibraries(ExternalLibraryUtils.java:81)
> >
> >   at org.apache.asterix.hyracks.bootstrap.CCApplication.start(CCApplication.java:147)
> >
> >   at org.apache.hyracks.control.cc.ClusterControllerService.startApplication(ClusterControllerService.java:236)
> >
> >   at org.apache.hyracks.control.cc.ClusterControllerService.start(ClusterControllerService.java:222)
> >
> >   at org.apache.hyracks.control.cc.CCDriver.main(CCDriver.java:48)
> >
> > Caused by: java.lang.ClassNotFoundException:
> > com.sun.xml.bind.v2.model.annotation.AnnotationReader
> >
> >   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> >
> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> >
> >   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
> >
> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> >
> >   ... 29 more
> >
> >
> > By comparing the builds, I found the problem occurs after the merge of
> this
> > patch[1] and it is in this release as well... Do we have a quick fix for
> > this?
> >
> > [1] https://asterix-gerrit.ics.uci.edu/#/c/2696/11
> >
> >
> > Best,
> > Xikui
> >
> >
> >
> >
> > On Mon, Jul 16, 2018 at 10:41 PM Ian Maxon  wrote:
> >
> >> Hi everyone,
> >>
> >> Please verify and vote on the latest release of Apache AsterixDB
> >>
> >> The change that produced this release and the change to advance the
> >> version are
> >> up for review here:
> >>
> >> https://asterix-gerrit.ics.uci.edu/#/c/2773/
> >> https://asterix-gerrit.ics.uci.edu/#/c/2772/
> >>
> >> To check out the release, simply fetch the review and check out the
> >> fetch head like so:
> >>
> >> git fetch ht

Re: [VOTE] Release Apache AsterixDB 0.9.4 and Hyracks 0.3.4 (RC0)

2018-07-17 Thread Xikui Wang
I noticed that the latest master has a problem running UDFs on a cluster.
When a UDF is deployed to the cluster, AsterixDB fails to start due to the
following exception:


Exception in thread "main" java.lang.NoClassDefFoundError:
com/sun/xml/bind/v2/model/annotation/AnnotationReader

  at java.lang.ClassLoader.defineClass1(Native Method)

  at java.lang.ClassLoader.defineClass(ClassLoader.java:763)

  at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)

  at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)

  at java.net.URLClassLoader.access$100(URLClassLoader.java:73)

  at java.net.URLClassLoader$1.run(URLClassLoader.java:368)

  at java.net.URLClassLoader$1.run(URLClassLoader.java:362)

  at java.security.AccessController.doPrivileged(Native Method)

  at java.net.URLClassLoader.findClass(URLClassLoader.java:361)

  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)

  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)

  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

  at java.lang.Class.getDeclaredMethods0(Native Method)

  at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)

  at java.lang.Class.privateGetMethodRecursive(Class.java:3048)

  at java.lang.Class.getMethod0(Class.java:3018)

  at java.lang.Class.getMethod(Class.java:1784)

  at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:242)

  at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:234)

  at javax.xml.bind.ContextFinder.find(ContextFinder.java:441)

  at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:641)

  at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:584)

  at org.apache.asterix.app.external.ExternalLibraryUtils.getLibrary(ExternalLibraryUtils.java:325)

  at org.apache.asterix.app.external.ExternalLibraryUtils.configureLibrary(ExternalLibraryUtils.java:288)

  at org.apache.asterix.app.external.ExternalLibraryUtils.setUpExternaLibraries(ExternalLibraryUtils.java:81)

  at org.apache.asterix.hyracks.bootstrap.CCApplication.start(CCApplication.java:147)

  at org.apache.hyracks.control.cc.ClusterControllerService.startApplication(ClusterControllerService.java:236)

  at org.apache.hyracks.control.cc.ClusterControllerService.start(ClusterControllerService.java:222)

  at org.apache.hyracks.control.cc.CCDriver.main(CCDriver.java:48)

Caused by: java.lang.ClassNotFoundException:
com.sun.xml.bind.v2.model.annotation.AnnotationReader

  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)

  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)

  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)

  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

  ... 29 more


By comparing the builds, I found that the problem first occurs after the
merge of this patch [1], and it is in this release as well... Do we have a
quick fix for this?

[1] https://asterix-gerrit.ics.uci.edu/#/c/2696/11
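One quick sanity check for suspected compile-target mismatches like the one discussed in this thread is to read a class file's bytecode major version. This is a generic JVM sketch, not project-specific; the demo file name is invented, and to check a real UDF jar you would first extract one .class from it (e.g. with unzip -p).

```shell
# Bytes 6-7 of a .class file hold the bytecode major version, big-endian:
# 52 = Java 8, 53 = Java 9, 54 = Java 10.
class_major_version() {
    od -An -tu1 -j6 -N2 "$1" | awk '{ print $1 * 256 + $2 }'
}

# demo on a synthetic Java 8 class-file header
# (magic CAFEBABE, minor version 0, major version 0x34 = 52)
printf '\312\376\272\276\000\000\000\064' > /tmp/demo.class
class_major_version /tmp/demo.class    # prints 52
```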


Best,
Xikui




On Mon, Jul 16, 2018 at 10:41 PM Ian Maxon  wrote:

> Hi everyone,
>
> Please verify and vote on the latest release of Apache AsterixDB
>
> The change that produced this release and the change to advance the
> version are
> up for review here:
>
> https://asterix-gerrit.ics.uci.edu/#/c/2773/
> https://asterix-gerrit.ics.uci.edu/#/c/2772/
>
> To check out the release, simply fetch the review and check out the
> fetch head like so:
>
> git fetch https://asterix-gerrit.ics.uci.edu:29418/asterixdb
> refs/changes/72/2772/1 && git checkout FETCH_HEAD
>
>
> AsterixDB Source
>
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4-source-release.zip
>
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4-source-release.zip.asc
>
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.4-source-release.zip.sha1
>
> SHA1:7ca7dee5408fb77010bdd1cde83a35452b087385
>
> Hyracks Source
>
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.4-source-release.zip
>
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.4-source-release.zip.asc
>
> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.4-source-release.zip.sha1
>
> SHA1:17654682f9cb6f5ad9811fd644c954afa330ce01
>
> AsterixDB NCService Installer:
>
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.4-binary-assembly.zip
>
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.4-binary-assembly.zip.asc
>
> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.4-binary-assembly.zip.sha1
>
> SHA1:a0931dc6aedab4007112ee75b62028336382fb72
>
> Additionally, a staged maven repository is available at:
>
> https://repository.apache.org/content/repositories/orgapacheasterix-1043/
>
> The KEYS file containing the PGP keys used to sign the release can be
> found at
>
> https://dist.apache.org/repos/dist/release/asterixdb/KEYS
>
> RAT was executed as part of Maven via the RAT maven plugin, but
> excludes files that 

Re: Several nodes on one machine with ansible

2018-06-20 Thread Xikui Wang
Hi Kristoffer,

For the UDF installation, what Ansible does is copy things over to the
right place. Similarly, if you copy the unarchived UDF library to the right
local directory, it will be picked up by the system as well. For the local
instance case, copying the unarchived files to
"asterix-server-0.9.4-SNAPSHOT-binary-assembly/lib/udfs/DV_NAME/LIB_NAME/"
will give you a local instance with the UDF installed.

An example with the default library installed on my machine looks like the
following:

/Users/xikuiw/Temp/asterix-server-0.9.4-SNAPSHOT-binary-assembly/lib/udfs/test/test
% ls

DEPENDENCIES
LICENSE
NOTICE
asterix-external-data-0.9.4-SNAPSHOT-tests.jar
change_feed.csv
library_descriptor.xml

Hope this can help you.

Best,
Xikui

On Wed, Jun 20, 2018 at 2:18 AM, Kristoffer Finckenhagen <
kristoffer.finckenha...@gmail.com> wrote:

> Ok, thanks. I've got it running with the scripts in the
> asterix-server/target/asterix-server-*-binary-assembly/opt/local/bin
> folder. But I'm unsure of how to get my UDFs to work with this setup. I can
> only find documentation on installing them with the deprecated Managix and
> Ansible; how would I go about doing this?
>
> Again, thanks for the help and fast response.
>
> Kristoffer
>
> On Tue, 12 Jun 2018 at 22:38 Ian Maxon  wrote:
>
> > If you're using only one machine, there's not much point in using
> > Ansible. The easiest thing would be to just specify multiple iodevices
> > with one NC, or start multiple NCServices. Take a look at the
> > start-sample-cluster.sh script to see how it starts things; it
> > actually runs two NCs locally.
> >
> > On Tue, Jun 12, 2018 at 6:38 AM, Kristoffer Finckenhagen
> >  wrote:
> > > Hi.
> > >
> > > I'm currently running some neural nets for sentiment analysis as UDFs on
> > > Asterix feeds, deploying the cluster with Ansible. Currently there is only
> > > one NC running on localhost, but I would like to test with several on
> > > the same machine; I'm just not completely sure how. Any help would be
> > > greatly appreciated.
> > >
> > > Regards
> > > Kristoffer
> > > --
> > > Mvh Kristoffer Finckenhagen
> >
> --
> Mvh Kristoffer Finckenhagen
>
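For the multiple-NC route suggested in this thread, a cc.conf in the NCService style can declare one section per NC. This is an abridged sketch modeled on the sample-cluster configuration; the paths and node names are assumptions, exact keys can vary by version, and the second NC also needs non-conflicting ports (omitted here).

```ini
; two NCService nodes plus the CC, all on localhost (sketch)
[nc/red]
txn.log.dir = /tmp/asterix/red/txnlog
core.dump.dir = /tmp/asterix/red/coredump
iodevices = /tmp/asterix/red

[nc/blue]
txn.log.dir = /tmp/asterix/blue/txnlog
core.dump.dir = /tmp/asterix/blue/coredump
iodevices = /tmp/asterix/blue

[nc]
address = 127.0.0.1
command = asterixnc

[cc]
address = 127.0.0.1
```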


Inline WithExpression in SQLPP function

2018-05-30 Thread Xikui Wang
Hi Devs,

It seems we intentionally chose not to inline the WithExpression during
SQLPP function body rewriting. Is there any particular reason for doing
that? Thanks in advance!

Best,
Xikui


Debugging in development

2018-05-04 Thread Xikui Wang
Hi,

I was trying to add more info about the testing problem that we discussed
this morning, and I found that we actually have a page on Confluence about
debugging [1]. I added two bullets on the use of "SqlppExecutionTest" and
"SqlExecutionIT" (please help revise and add more!). I hope they will be
useful for those who have just started working on AsterixDB. There are also
some useful design docs on Confluence [2] that were added by previous
developers. Have a look if you didn't know these materials existed (like me
in my first year). :)

[1] https://cwiki.apache.org/confluence/display/ASTERIXDB/Debugging
[2] https://cwiki.apache.org/confluence/display/ASTERIXDB/Home

Best,
Xikui


Optimizer Tests for SQLPP

2018-04-14 Thread Xikui Wang
Hi Devs,

As I mentioned in the weekly meeting, I found that our OptimizerTest
doesn't actually run the SQLPP tests. Although there is a separate
directory, 'queries_sqlpp', which contains all the legacy optimizer tests
translated into SQLPP, they are not picked up by the OptimizerTest, and
new SQLPP tests are still being added to the old directory, mixed up with
the AQL tests.

I tried to run those SQLPP tests, and more than half of them failed. There
are syntax errors (query-issue838.sqlpp), variable name changes, join
algorithm changes (word-jaccard.sqlpp), and other changes (issue730.sqlpp).
I submitted one patch that fixed the test cases with variable name changes.
For the rest, I think we need to decide, between the two versions of the
results, which ones are the expected plans, and then fix the errors. There
are some obvious patterns in the plan changes, so I think we only need to
fix a few things to cover the remaining ~450 test cases...

Best,
Xikui


Re: Specification of "Expression" in SQLPP

2018-03-15 Thread Xikui Wang
The patch is under review now. :)

Best,
Xikui

On Thu, Mar 15, 2018 at 8:47 PM, Mike Carey <dtab...@gmail.com> wrote:

> Let's do it!  No issues for the rewrite step that follows when inlining,
> hopefully?
>
> On Thu, Mar 15, 2018, 1:19 PM Dmitry Lychagin <
> dmitry.lycha...@couchbase.com>
> wrote:
>
> > Xikui,
> >
> > Right, seems like we could change the FunctionDeclaration production to
> > include SelectExpression as you suggested.
> >
> > Thanks,
> > -- Dmitry
> >
> > On 3/15/18, 12:16 PM, "Xikui Wang" <xik...@uci.edu> wrote:
> >
> > The issue that I'm looking at is about UDF specification. Currently,
> > we use
> > this:
> >
> > FunctionDeclaration::=  Identifier ParameterList
> >  Expression ,
> >
> > which enforces the "SelectExpression" to be put into parentheses so
> it
> > can
> > be matched by "OperatorExpr" (see the specification of Expression in
> my
> > previous email). This I think probably is not necessary. Similar
> syntax
> > without explicit parentheses is available in AQL. There was a test
> > case for
> > this and it's disabled for now.
> >
> > As suggested by Mike, I went through the usages of "Expression". I
> > guess
> > the separation is to make sure all "SelectExpression"s are
> > parenthesized if
> > it's not a top-level query. For that purpose, maybe I can change the
> > Function Declaration to this:
> >
> > FunctionDeclaration::=  Identifier ParameterList
> >  SelectExpression | Expression   ?
> >
> > Best,
> > Xikui
> >
> > On Thu, Mar 15, 2018 at 10:45 AM, Dmitry Lychagin <
> > dmitry.lycha...@couchbase.com> wrote:
> >
> > > Xikui,
> > >
> > > "(" SelectExpression ")" is permitted by the Subquery() production,
> > and
> > > Subquery() itself is one of the alternatives in the
> > > ParenthesizedExpression().
> > >
> > > What is the compilation issue you were trying to solve?
> > >
> > > Thanks,
> > > -- Dmitry
> > >
> > > On 3/14/18, 11:52 PM, "Mike Carey" <dtab...@gmail.com> wrote:
> > >
> > > Not sure, but I don't think this is (nearly) sufficient
> > context/info to
> > > see what's going on.  With the current factoring of things, any
> > other
> > > place that includes Expression is not going to allow a
> > SelectExpression
> > > to appear directly as an Expression.  Your change would - which
> > might
> > > be
> > > a major change, syntactically, that might lead to a variety of
> > > ambiguities.  Without looking at the whole grammar one can't
> > tell.  I
> > > would guess that if you look, you may find that you can have a
> > > SelectExpression as a query if and only if its enclosed in
> > parentheses,
> > > which might be by design to avoid ambiguities. Have a look at
> > that
> > > (Basically, look at the other uses of Expression in the
> grammar -
> > > and/or
> > > look to see if "(" SelectExpression ")" is permitted if you
> > follow
> > > through the grammar from Expression.
> > >
> > >
> > > On 3/14/18 8:59 PM, Xikui Wang wrote:
> > > > Dear Devs,
> > > >
> > > > I'm trying to fix a compilation issue with the subquery in
> > SQLPP and
> > > got a
> > > > question about the specification of "Expression". Here is the
> > current
> > > > grammar of "Query" and "Expression" in SQLPP:
> > > >
> > > > Query::=( Expression | SelectExpression )
> > > > Expression::=( OperatorExpr | CaseExpr |
> QuantifiedExpression )
> > > >
> > > > I'm wondering why "SelectExpression" is not in the
> > specification of
> > > > "Expression" but in "Query". When I looked back to the AQL
> > > specification, I
> > > > found that we have:
> > > >
> > > > Query::=Expression
> > > > Expression::=( OperatorExpr | IfThenElse | FLWOGR |
> > > QuantifiedExpression )
> > > >
> > > > If this specification in SQLPP is not intentionally designed,
> > can we
> > > change
> > > > it to this:
> > > >
> > > > Query::=( Expression )
> > > > Expression::=( OperatorExpr | CaseExpr |
> QuantifiedExpression |
> > > > SelectExpression ) ?
> > > >
> > > > As the Subquery are handled separately in the parenthesized
> > > expression
> > > > part, the "SelectExpression" here is non-subquery by default.
> > > >
> > > > Any thoughts? Thanks!
> > > >
> > > > Best,
> > > > Xikui
> > > >
> > >
> > >
> > >
> > >
> >
> >
> >
>


Re: Specification of "Expression" in SQLPP

2018-03-15 Thread Xikui Wang
The issue that I'm looking at is about UDF specification. Currently, we use
this:

FunctionDeclaration::=  Identifier ParameterList
 Expression ,

which forces the "SelectExpression" to be put into parentheses so that it
can be matched as an "OperatorExpr" (see the specification of Expression in
my previous email). I don't think this is strictly necessary. Similar
syntax without explicit parentheses is available in AQL. There was a test
case for this, but it's disabled for now.
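To make the effect concrete, here is a hedged SQL++ sketch (the function
and dataset names are made up for illustration) of what the current grammar
requires versus what the relaxed rule would allow:

```sql
-- Accepted today: the SELECT is parenthesized, so it reaches the
-- function body as a Subquery inside an OperatorExpr.
declare function recent_tweets() {
  (select value t from Tweets t where t.created_at > "2018-01-01")
};

-- Only legal under the proposed relaxation: a bare SelectExpression
-- as the function body, with no parentheses required.
declare function recent_tweets_relaxed() {
  select value t from Tweets t where t.created_at > "2018-01-01"
};
```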

As suggested by Mike, I went through the usages of "Expression". I guess
the separation is to make sure all "SelectExpression"s are parenthesized
unless they appear as the top-level query. For that purpose, maybe I can change the
Function Declaration to this:

FunctionDeclaration::=  Identifier ParameterList
 SelectExpression | Expression   ?

Best,
Xikui

On Thu, Mar 15, 2018 at 10:45 AM, Dmitry Lychagin <
dmitry.lycha...@couchbase.com> wrote:

> Xikui,
>
> "(" SelectExpression ")" is permitted by the Subquery() production, and
> Subquery() itself is one of the alternatives in the
> ParenthesizedExpression().
>
> What is the compilation issue you were trying to solve?
>
> Thanks,
> -- Dmitry
>
> On 3/14/18, 11:52 PM, "Mike Carey" <dtab...@gmail.com> wrote:
>
> Not sure, but I don't think this is (nearly) sufficient context/info to
> see what's going on.  With the current factoring of things, any other
> place that includes Expression is not going to allow a SelectExpression
> to appear directly as an Expression.  Your change would - which might
> be
> a major change, syntactically, that might lead to a variety of
> ambiguities.  Without looking at the whole grammar one can't tell.  I
> would guess that if you look, you may find that you can have a
> SelectExpression as a query if and only if it's enclosed in parentheses,
> which might be by design to avoid ambiguities. Have a look at that
> (Basically, look at the other uses of Expression in the grammar -
> and/or
> look to see if "(" SelectExpression ")" is permitted if you follow
> through the grammar from Expression.)
>
>
> On 3/14/18 8:59 PM, Xikui Wang wrote:
> > Dear Devs,
> >
> > I'm trying to fix a compilation issue with the subquery in SQLPP and
> got a
> > question about the specification of "Expression". Here is the current
> > grammar of "Query" and "Expression" in SQLPP:
> >
> > Query::=( Expression | SelectExpression )
> > Expression::=( OperatorExpr | CaseExpr | QuantifiedExpression )
> >
> > I'm wondering why "SelectExpression" is not in the specification of
> > "Expression" but in "Query". When I looked back to the AQL
> specification, I
> > found that we have:
> >
> > Query::=Expression
> > Expression::=( OperatorExpr | IfThenElse | FLWOGR |
> QuantifiedExpression )
> >
> > If this specification in SQLPP is not intentionally designed, can we
> change
> > it to this:
> >
> > Query::=( Expression )
> > Expression::=( OperatorExpr | CaseExpr | QuantifiedExpression |
> > SelectExpression ) ?
> >
> > As subqueries are handled separately in the parenthesized-expression
> > part, the "SelectExpression" here is non-subquery by default.
> >
> > Any thoughts? Thanks!
> >
> > Best,
> > Xikui
> >
>
>
>
>


Specification of "Expression" in SQLPP

2018-03-14 Thread Xikui Wang
Dear Devs,

I'm trying to fix a compilation issue with the subquery in SQLPP and got a
question about the specification of "Expression". Here is the current
grammar of "Query" and "Expression" in SQLPP:

Query::=( Expression | SelectExpression )
Expression::=( OperatorExpr | CaseExpr | QuantifiedExpression )

I'm wondering why "SelectExpression" is not in the specification of
"Expression" but in "Query". When I looked back to the AQL specification, I
found that we have:

Query::=Expression
Expression::=( OperatorExpr | IfThenElse | FLWOGR | QuantifiedExpression )

If this specification in SQLPP is not intentionally designed, can we change
it to this:

Query::=( Expression )
Expression::=( OperatorExpr | CaseExpr | QuantifiedExpression |
SelectExpression ) ?

As subqueries are handled separately in the parenthesized-expression
part, the "SelectExpression" here is non-subquery by default.

Any thoughts? Thanks!

Best,
Xikui


Re: URGENT: Please shorten test filenames!

2018-03-09 Thread Xikui Wang
I see... I was able to avoid the issue this way some time ago... I guess the
filenames weren't this long back then. :(

Best,
Xikui

On Fri, Mar 9, 2018 at 2:18 PM, Chris Hillery <chill...@hillery.land> wrote:

> On Fri, Mar 9, 2018 at 8:17 AM, Xikui Wang <xik...@uci.edu> wrote:
>
>> One quick workaround is to put the project under the root, e.g., C:/.
>> This is not a good solution but it will enable you to at least build the
>> project...
>>
>
> It doesn't, though. As I said, AsterixDB all by itself cannot fit in 260
> characters, even when checked out directly into C:\...
>
> Ceej
> aka Chris Hillery
>


Re: URGENT: Please shorten test filenames!

2018-03-09 Thread Xikui Wang
One quick workaround is to put the project under the root, e.g., C:/. This
is not a good solution but it will enable you to at least build the
project...

Best,
Xikui

On Fri, Mar 9, 2018 at 07:30 Taewoo Kim  wrote:

> Hi Chris,
>
> I will take care of this today.
>
> Best,
> Taewoo
>
> On Fri, Mar 9, 2018 at 2:01 AM, Chris Hillery 
> wrote:
>
> > There are a number of files in asterixdb with extremely long paths; the
> > worst offender currently is
> >
> > asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/index-
> > leftouterjoin/probe-sidx-btree-non-indexonly-plan-with-
> > join-btree-sidx1-indexonly-plan/probe-sidx-btree-non-
> > indexonly-plan-with-join-btree-sidx1-indexonly-plan.2.update.sqlpp
> >
> > which is 248 characters long. Counting the name of the asterixdb/ source
> > directory itself, that's 258 characters.
> >
> > Fun fact: On Windows, the longest allowable path is 260 characters,
> > including the three leading C:\ characters. That means as of February 15
> > (when this file was added), it's impossible to check out AsterixDB on
> > Windows.
> >
> > For us over at Couchbase, this has in fact broken some of our build jobs,
> > so it is a matter of some urgency. The absolute shortest path we can
> check
> > out AsterixDB into is C:\t\analytics\asterixdb\, which is 25 characters
> > long. That means the absolute longest total path in asterixdb cannot
> exceed
> > 235 characters. And really, it's quite frustrating to only be able to
> check
> > out code into a single-letter directory like C:\t\, so it would certainly
> > be nice to have at least a couple dozen characters for our own layout,
> like
> > C:\Jenkins\workspace\name-of-build-job.
> >
> > Can we please do two things:
> >
> > *1. ASAP rename the test files* introduced by commit c3c2357 to something
> > at least 20 characters shorter. I'm not sure that this is the only commit
> > causing trouble, but I can tell you that only files in the following two
> > directories are assuredly breaking things and they were both introduced
> by
> > that commit:
> >
> > asterixdb/asterix-app/src/test/resources/runtimets/
> > queries_sqlpp/index-join/btree-secondary-non-indexonly-
> > plan-to-secondary-indexonly-plan-equi-join_01/
> >
> > asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/index-
> > leftouterjoin/probe-sidx-btree-indexonly-plan-with-
> > join-btree-sidx1-indexonly-plan
> >
> >
> > *2. Going forward,* can we possibly limit overall paths to, say, 200
> > characters, or even 220 characters? And maybe have a SonarQube or other
> > commit-validation process to prevent longer paths from going in?
> >
> > Appreciate your immediate attention to at least point #1 above!
> >
> > Thanks,
> > Ceej
> > aka Chris Hillery
> >
>
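A commit-time check like the one Chris asks for in point #2 can be sketched
in a few lines of Python. This is illustrative only, not an existing
validation job, and the 220-character cap simply mirrors the number floated
in this thread:

```python
import os

MAX_PATH = 220  # suggested cap from this thread; the real limit is a policy decision


def find_long_paths(root, limit=MAX_PATH):
    """Return (length, repo-relative path) pairs exceeding `limit`, longest first."""
    offenders = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            if len(rel) > limit:
                offenders.append((len(rel), rel))
    return sorted(offenders, reverse=True)
```

Run against a checkout root, it lists every file whose repo-relative path
exceeds the cap; wiring it into a pre-commit hook or CI step would fail the
build when the list is non-empty.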


Re: Logging.properties

2018-01-09 Thread Xikui Wang
A small note for IntelliJ users.

The '-Dlog4j.configurationFile' in VM options in the Unit Test dialog will
be overwritten by the surefire argLine parameters by default (without
asking :( ). To avoid this, you need to uncheck the 'argLine' box under
Preferences -> Build, Execution, Deployment -> Build Tools -> Maven ->
Running Tests. Or you can simply change the content of
'asterix-app/src/test/resources/log4j2-test.xml' for debugging purposes. :)
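For reference, a minimal log4j2-test.xml could look roughly like the sketch
below. The appender name and pattern are placeholders; the logger name and
the FINE-to-DEBUG level mapping come from Murtadha's and Taewoo's notes
quoted further down this thread:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <Console name="Console" target="SYSTEM_OUT">
      <PatternLayout pattern="%d %p %c - %m%n"/>
    </Console>
  </Appenders>
  <Loggers>
    <!-- java.util.logging FINE corresponds to log4j2 DEBUG -->
    <Logger name="org.apache.hyracks.algebricks" level="DEBUG"/>
    <Root level="WARN">
      <AppenderRef ref="Console"/>
    </Root>
  </Loggers>
</Configuration>
```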

Best,
Xikui

On Thu, Dec 21, 2017 at 12:32 PM, Taewoo Kim  wrote:

> Update:
>
> Murtadha's method works like a charm. For those folks who want to see
> Algebricks optimization details (before and after), the following is what
> you need to add in the <Loggers> section of the log4j2-test.xml file. Please
> note that the name is "org.apache.hyracks.algebricks", not
> "org.apache.hyracks.algebricks.level". Thanks again @Murtadha.
>
> <Logger name="org.apache.hyracks.algebricks" level="DEBUG">
>   <AppenderRef ref="Console"/>
> </Logger>
>
>
>
> Best,
> Taewoo
>
> On Wed, Dec 20, 2017 at 5:22 PM, Taewoo Kim  wrote:
>
> > @Murtadha: forgot to reply. Thank you so much!
> >
> > Best,
> > Taewoo
> >
> > On Tue, Dec 19, 2017 at 11:53 PM, Murtadha Hubail 
> > wrote:
> >
> >> Hi Taewoo,
> >>
> >> The new argument to set is -Dlog4j.configurationFile and you need to
> >> provide a log4j2 compatible configuration file. It is more or less
> similar
> >> to logging.properties.
> >> You can check [1] for more details about the configuration. We already
> >> have a configuration file that you can use and modify under
> >> asterix-app/src/test/resources/log4j2-test.xml. Changing the
> >> configuration there should reflect the changes
> >> on the tests. One thing to note is that log4j2 log levels are different
> >> than java logging. You can check [2] for the mapping between the old and
> >> the new levels.
> >>
> >> Cheers,
> >> Murtadha
> >>
> >> [1] https://logging.apache.org/log4j/2.0/manual/configuration.html
> >> [2] https://logging.apache.org/log4j/2.0/log4j-jul/index.html
> >>
> >> On 12/20/2017, 10:38 AM, "Taewoo Kim"  wrote:
> >>
> >> Hello All,
> >>
> >> Not long ago, for each test suite (e.g., OptimizerTest), we could
> >> provide a custom log level property file (logging.properties) as a
> VM
> >> option
> >> (e.g., -Djava.util.logging.config.file=/.../asterixdb/asterixdb/ast
> >> erix-app/src/test/resources/logging.properties)
> >> and customize logging level for each phase (Hyracks, Algebricks, and
> >> so
> >> on). For example, if I set "org.apache.hyracks.algebricks.level =
> >> FINE",
> >> then only logging level for Algebricks is changed to FINE. It seems
> >> that
> >> this method doesn't work anymore. Could somebody tell me how we
> could
> >> set a
> >> custom logging level for each phase? Thanks!
> >>
> >> Best,
> >> Taewoo
> >>
> >>
> >>
> >>
> >
>


Re: [VOTE] Release Apache AsterixDB 0.9.3 and Hyracks 0.3.3 (RC0)

2018-01-04 Thread Xikui Wang
+1

Downloaded and verified hashes.

Best,
Xikui

On Thu, Jan 4, 2018 at 1:34 PM, Heri Ramampiaro  wrote:

> +1
> -heri
>
> > On Jan 4, 2018, at 19:01, Taewoo Kim  wrote:
> >
> > Bumped. :-)
> >
> > Best,
> > Taewoo
> >
> >> On Thu, Dec 21, 2017 at 2:15 PM, Ian Maxon  wrote:
> >>
> >> Ah, yes, those should be filtered out, but it's just an unnecessary
> >> appendix. It doesn't harm anything.
> >>
> >> On Thu, Dec 21, 2017 at 11:42 AM, Wail Alkowaileet 
> >> wrote:
> >>> One thing I'm not sure about, there are pom.xml.versionsBackup in
> >> AsterixDB
> >>> modules.
> >>>
> >>> On Thu, Dec 21, 2017 at 11:39 AM, Wail Alkowaileet  >
> >>> wrote:
> >>>
>  +1
>  Downloaded
>  Verified signatures and hashes
>  Verified source build + ran unit tests and integration tests
> 
>  On Thu, Dec 21, 2017 at 11:11 AM, Taewoo Kim 
> >> wrote:
> 
> > +1
> >
> > Downloaded
> > Verified signatures and hashes
> > Verified the source build
> > Ran a local sample cluster and issued some queries
> >
> > PS:
> > https://cwiki.apache.org/confluence/display/ASTERIXDB/Releas
> > e+Verification
> > will be helpful to do these. :-)
> >
> >
> > Best,
> > Taewoo
> >
> > On Wed, Dec 20, 2017 at 10:14 PM, Mike Carey 
> >> wrote:
> >
> >> +1
> >>
> >> Downloaded and ran the local version - walked through the tutorial
> >> docs
> >> (the SQL++ primer) - found and filed one issue in those that we
> >> should
> > go
> >> ahead and (finally :-)) fix.
> >>
> >>
> >>
> >>> On 11/20/17 5:07 PM, Ian Maxon wrote:
> >>>
> >>> Hi everyone,
> >>>
> >>> Please verify and vote on the 4th release of Apache AsterixDB
> >>>
> >>> The change that produced this release and the change to advance the
> >>> version are
> >>> up for review here:
> >>>
> >>> https://asterix-gerrit.ics.uci.edu/#/c/2170/
> >>> https://asterix-gerrit.ics.uci.edu/#/c/2171
> >>>
> >>> To check out the release, simply fetch the review and check out the
> >>> fetch head like so:
> >>>
> >>> git fetch https://asterix-gerrit.ics.uci.edu:29418/asterixdb
> >>> refs/changes/70/2070/1 && git checkout FETCH_HEAD
> >>>
> >>>
> >>> AsterixDB Source
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/apache-aste
> >>> rixdb-0.9.3-source-release.zip
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/apache-aste
> >>> rixdb-0.9.3-source-release.zip.asc
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/apache-aste
> >>> rixdb-0.9.3-source-release.zip.sha1
> >>>
> >>> SHA1:52ecae081b5d4ef8e7cabcd6531471c408a0a7ac
> >>>
> >>> Hyracks Source
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyra
> >>> cks-0.3.3-source-release.zip
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyra
> >>> cks-0.3.3-source-release.zip.asc
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyra
> >>> cks-0.3.3-source-release.zip.sha1
> >>>
> >>> SHA1:1457b140e61a11da8caa6da75cbeba7c553371de
> >>>
> >>> AsterixDB NCService Installer:
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-ser
> >>> ver-0.9.3-binary-assembly.zip
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-ser
> >>> ver-0.9.3-binary-assembly.zip.asc
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-ser
> >>> ver-0.9.3-binary-assembly.zip.sha1
> >>>
> >>> SHA1:f05574389ac10a7da9696b4435ad53a5f6c0053a
> >>>
> >>> AsterixDB Managix Installer
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-ins
> >>> taller-0.9.3-binary-assembly.zip
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-ins
> >>> taller-0.9.3-binary-assembly.zip.asc
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-ins
> >>> taller-0.9.3-binary-assembly.zip.sha1
> >>>
> >>> SHA1:ef308b80441ac2c9437f9465dab3decd35b30189
> >>>
> >>> AsterixDB YARN Installer
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-yar
> >>> n-0.9.3-binary-assembly.zip
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-yar
> >>> n-0.9.3-binary-assembly.zip.asc
> >>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-yar
> >>> n-0.9.3-binary-assembly.zip.sha1
> >>>
> >>> SHA1:e85c09dd8ff18902503868626bdee301184e4310
> >>>
> >>> Additionally, a staged maven repository is available at:
> >>>
> >>> https://repository.apache.org/content/repositories/orgapache
> > asterix-1036/
> >>>
> >>> The KEYS file containing the PGP keys used to sign the release can
> >> be
> >>> found at
> >>>
> >>> 

Re: MultiTransactionJobletEventListenerFactory

2017-11-17 Thread Xikui Wang
>>>>> Back to my question, how were you planning to change the
> transaction
> > >> id
> > >>>> if
> > >>>>>> we forget about the case with multiple datasets (feed job)?
> > >>>>>>
> > >>>>>>
> > >>>>>>> On Nov 17, 2017, at 10:38 AM, Steven Jacobs <sjaco...@ucr.edu>
> > >> wrote:
> > >>>>>>>
> > >>>>>>> Maybe it would be good to have a meeting about this with all
> > >> interested
> > >>>>>>> parties?
> > >>>>>>>
> > >>>>>>> I can be on-campus at UCI on Tuesday if that would be a good day
> to
> > >>>> meet.
> > >>>>>>>
> > >>>>>>> Steven
> > >>>>>>>
> > >>>>>>> On Fri, Nov 17, 2017 at 9:36 AM, abdullah alamoudi <
> > >> bamou...@gmail.com
> > >>>>>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Also, was wondering how would you do the same for a single
> dataset
> > >>>>>>>> (non-feed). How would you get the transaction id and change it
> > when
> > >>>> you
> > >>>>>>>> re-run?
> > >>>>>>>>
> > >>>>>>>> On Nov 17, 2017 7:12 AM, "Murtadha Hubail" <hubail...@gmail.com
> >
> > >>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> For atomic transactions, the change was merged yesterday. For
> > >> entity
> > >>>>>>>> level
> > >>>>>>>>> transactions, it should be a very small change.
> > >>>>>>>>>
> > >>>>>>>>> Cheers,
> > >>>>>>>>> Murtadha
> > >>>>>>>>>
> > >>>>>>>>>> On Nov 17, 2017, at 6:07 PM, abdullah alamoudi <
> > >> bamou...@gmail.com>
> > >>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> I understand that is not the case right now but what you're
> > >> working
> > >>>>>> on?
> > >>>>>>>>>>
> > >>>>>>>>>> Cheers,
> > >>>>>>>>>> Abdullah.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>> On Nov 17, 2017, at 7:04 AM, Murtadha Hubail <
> > >> hubail...@gmail.com>
> > >>>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>> A transaction context can register multiple primary indexes.
> > >>>>>>>>>>> Since each entity commit log contains the dataset id, you can
> > >>>>>>>> decrement
> > >>>>>>>>> the active operations on
> > >>>>>>>>>>> the operation tracker associated with that dataset id.
> > >>>>>>>>>>>
> > >>>>>>>>>>> On 17/11/2017, 5:52 PM, "abdullah alamoudi" <
> > bamou...@gmail.com>
> > >>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>> Can you illustrate how a deadlock can happen? I am anxious to
> > >> know.
> > >>>>>>>>>>> Moreover, the reason for the multiple transaction ids in
> feeds
> > is
> > >>>>>>>> not
> > >>>>>>>>> simply because we compile them differently.
> > >>>>>>>>>>>
> > >>>>>>>>>>> How would a commit operator know which dataset active
> operation
> > >>>>>>>>> counter to decrement if they share the same id for example?
> > >>>>>>>>>>>
> > >>>>>>>>>>>> On Nov 16, 2017, at 9:46 PM, Xikui Wang <xik...@uci.edu>
> > wrote:
> > >>>>>>>>>>>>

Support for user-defined-function on feeds

2017-10-21 Thread Xikui Wang
Hi Devs,

With the merge of the patch in [1], we no longer support attaching AQL
user-defined-function to feed. AsterixDB now returns an "incompatible
function language" exception for this case.

Meanwhile, we start to support adding SQLPP user-defined-function to feed.
If you encounter any problems with using it, please file an issue and let
me know. Thanks.

[1] https://asterix-gerrit.ics.uci.edu/#/c/2059/

Best,
Xikui


Re: Monitoring stream

2017-10-11 Thread Xikui Wang
Hi Kristoffer,

In the context of data feeds, there is a servlet that we created for
monitoring ingestion status. In the localhost case, you can use the
following REST API to access the number of incoming records and the
number of records that failed at the parser. When there are no active
feeds, this returns an empty JSON object.

http://127.0.0.1:19002/admin/active
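A small Python sketch of a client for this endpoint. Note that the field
names in the sample response used below are hypothetical; the actual keys
depend on the AsterixDB version you are running:

```python
import json
from urllib.request import urlopen

# Endpoint from this thread; adjust host/port for non-localhost clusters.
ACTIVE_ENDPOINT = "http://127.0.0.1:19002/admin/active"


def parse_active_stats(body):
    """Normalize the /admin/active JSON body; an empty object means no active feeds."""
    stats = json.loads(body)
    return stats if isinstance(stats, dict) else {}


def fetch_active_stats(endpoint=ACTIVE_ENDPOINT):
    # Requires a running AsterixDB instance with the active servlet enabled.
    with urlopen(endpoint) as resp:
        return parse_active_stats(resp.read().decode("utf-8"))
```

Polling `fetch_active_stats()` in a loop is one simple way to watch
ingestion progress while a feed is connected.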

Best,
Xikui

On Wed, Oct 11, 2017 at 9:31 AM, Steven Jacobs  wrote:

> Hi Kristoffer,
> We don't explicitly have "streams" in Asterix, but we have a way to monitor
> data activity through what we call "channels" which are similar to
> continuous queries. We have a publication describing channels here:
> https://dl.acm.org/citation.cfm?id=2933313. I can also provide you with
> more information on channels, including how to create clusters that can run
> them (we have an extension codebase to use them) depending on what your
> interests are.
>
> Steven
>
> On Wed, Oct 11, 2017 at 2:12 AM, Kristoffer Finckenhagen <
> kristoffer.finckenha...@gmail.com> wrote:
>
> > Hi!
> >
> > My name's Kristoffer, and I'm a student of Heri Ramampiaro. I was
> wondering
> > if there was a way to monitor stream activity in asterixdb. Heri
> mentioned
> > there was supposed to be support for this, but neither of us could find
> > anything in the documentation.
> >
> > Any help would be appreciated
> >
> > Kristoffer Finckenhagen
> > --
> > Mvh Kristoffer Finckenhagen
> >
>


Deprecating '#' as line comment character

2017-09-29 Thread Xikui Wang
Hi Devs,

With the patch in [1], we no longer support using '#' for line comments.
You may use '--' or '//' as alternatives. Meanwhile, backquoting the
function name when invoking an external UDF in SQLPP is no longer
enforced. You can use either

testlib#getCapital("England") or `testlib#getCapital`("England")

to invoke your external UDF.

[1] https://asterix-gerrit.ics.uci.edu/#/c/2037/

Best,
Xikui


Re: Time to deprecate AQL?

2017-09-07 Thread Xikui Wang
+1!

Best,
Xikui

> On Sep 7, 2017, at 11:49, Gerald Sangudi  wrote:
> 
> :-)
> 
> On Sep 7, 2017 11:44 AM, "Michael Carey"  wrote:
> 
> As AsterixDB evolves, and additional features are added - e.g., DISTINCT
> aggregate support, or properly implemented query-bodied functions,
> supporting two query languages is hugely expensive:  Updating two grammars,
> parsers, rules, tests, ... IMO it is time to let go of AQL as an externally
> supported interface to AsterixDB and only have SQL++ going forward.  I
> think "everyone" has migrated - and if not we should force that migration.
> (Cloudberry is on SQL++ nowadays, BAD is on SQL++ nowadays, ...)  Any
> objections?  If not, I think we should make this decision officially and
> stop putting energy into carrying the AQL legacy around with us.  Thoughts?


Re: asterix-gerrit-source-assemblies build error

2017-07-04 Thread Xikui Wang
Hi Taewoo,

I have seen this error from time to time. It seems there are too many open
files on that machine. Retriggering the build should solve the problem.

Best,
Xikui

On Tue, Jul 4, 2017 at 2:31 PM, Taewoo Kim  wrote:

> Hi all,
>
> Has anyone seen the following error in asterix-gerrit-source-assemblies
> and
> know how to fix this? The file exists in the directory.
>
> https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-
> source-assemblies/558/
>
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-assembly-plugin:2.6:single
> (source-release-assembly) on project apache-asterixdb: Failed to
> create assembly: Error creating assembly archive source-release:
> Problem creating zip:
> /mnt/data/sde/asterix/workspace/asterix-gerrit-source-assemblies/checkout/
> asterixdb/./asterix-app/src/test/resources/optimizerts/
> results/open-index-non-enforced/btree-index-non-enforced/btree-index-non-
> enforced-09.plan
> (Too many open files) -> [Help 1]
>
> Best,
> Taewoo
>


Re: Parallel feed ingestion

2017-05-17 Thread Xikui Wang
Hi,

Firstly, 3) won't work well, as the socket server inside AsterixDB
accepts connections from the client side one at a time. What you will
observe with two clients sending data to one socket simultaneously is
that the 1st client will go through and the 2nd will be blocked after
several hundred records. This will continue until the 1st one finishes.
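A toy Python sketch of that behavior (not the AsterixDB intake itself): a
server that, like the feed's socket intake as described above, accepts one
connection at a time, so the second client's data is only consumed after
the first disconnects:

```python
import socket
import threading


def accept_one_at_a_time(srv, received):
    """Drain two clients sequentially from an already-listening socket."""
    for _ in range(2):
        conn, _addr = srv.accept()          # second client waits in the backlog
        chunks = []
        while True:
            chunk = conn.recv(1024)
            if not chunk:                   # EOF: client closed its end
                break
            chunks.append(chunk)
        received.append(b"".join(chunks))
        conn.close()


# Demo: the second client's payload is only seen after the first closes.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))                  # pick a free port
srv.listen(5)
port = srv.getsockname()[1]
received = []
t = threading.Thread(target=accept_one_at_a_time, args=(srv, received))
t.start()
for payload in (b"client-1 records", b"client-2 records"):
    c = socket.create_connection(("127.0.0.1", port))
    c.sendall(payload)
    c.close()
t.join()
srv.close()
```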

The comparison between 1) and 2) is interesting. (@Abdullah, please
correct me if I'm wrong.) IMO, 1) achieves parallelism at the operator
level by having the intake operator run on the designated nodes
simultaneously, while 2) achieves it at the job level by simply putting
up several jobs that run independently. I think 1) may have less
overhead than 2), since the part of the workflow that could be shared
is duplicated multiple times in 2). It would be useful to see how these
two perform under saturated conditions.

Best,
Xikui

On Wed, May 17, 2017 at 12:11 PM, Mike Carey  wrote:

> @Xikui?  @Abdullah?
>
>
>
> On 5/17/17 11:40 AM, Ildar Absalyamov wrote:
>
>> In light of Steven’s discussion about feeds in parallel thread I was
>> wondering what would be a correct way to push parallel ingestion as far as
>> possible in multinode\multipartition environment.
>> In one of my experiments I am trying to saturate the ingestion to see the
>> effect of computing stats in background.
>> Several things I’ve tried:
>> 1) Open a socket adapter on all NC:
>> create feed Feed using socket_adapter
>> (
>>  ("sockets”="NC1:10001,NC2:10001,…”),
>> …)
>>
>> 2) Connect several Feeds to a single dataset.
>> create feed Feed1 using socket_adapter
>> (
>>  ("sockets”="NC1:10001”),
>> …)
>> create feed Feed2 using socket_adapter
>> (
>>  ("sockets”="NC2:10001”),
>> …)
>>
>> 3) Have several nodes sending data into a single socket.
>>
>> In my previous experiments the parallelization did not quite show that
>> the bottleneck was on the sender side, but I am wondering if that will
>> still be the case, since a lot of things happened under the hood since the
>> last time.
>>
>> Best regards,
>> Ildar
>>
>
>


Re: Having problem with Gerrit and Jenkins

2017-04-02 Thread Xikui Wang
Update: I tried "mvn test" locally and it ran successfully. I ended up
creating a new branch with the same changes. Submitting this branch solved
the compilation error. :) Thanks for all your help.

On Sat, Apr 1, 2017 at 11:14 AM, Xikui Wang <xik...@uci.edu> wrote:

> Hi Abdullah, thanks! Running the updated tests now.
>
> btw, I noticed that my build is running on cb-jenkins-6
> <https://asterix-jenkins.ics.uci.edu/computer/cb-jenkins-6>, but other
> active patches on gerrit are on docker. Could this be a possible reason?
>
> Best,
> Xikui
>
> On Sat, Apr 1, 2017 at 11:06 AM, abdullah alamoudi <bamou...@gmail.com>
> wrote:
>
>> P.S
>>
>> If anyone runs this test locally, it will pass the first time (assuming
>> target is clean) and then will fail until target is cleaned. Somehow, this
>> test/build was done on a non-clean target. mblow or imaxon might have an
>> idea as to why.
>>
>> Cheers,
>> Abdullah.
>>
>> > On Apr 1, 2017, at 11:02 AM, abdullah alamoudi <bamou...@gmail.com>
>> wrote:
>> >
>> > Xikui,
>> > The failure in the BufferCacheRegressionTest is a false positive I
>> think. There is a bug in the test that I have fixed in
>> https://asterix-gerrit.ics.uci.edu/#/c/1619/ <
>> https://asterix-gerrit.ics.uci.edu/#/c/1619/> but that change is yet to
>> be reviewed..
>> > You can pick the fix from there.
>> >
>> > Cheers,
>> > Abdullah.
>> >
>> >> On Apr 1, 2017, at 10:56 AM, Xikui Wang <xik...@uci.edu> wrote:
>> >>
>> >> Hi Till,
>> >>
>> >> I went through the log on Jenkins. The "Address already in use"
>> exception
>> >> firstly occurs in hyracks-client package. That works fine locally when
>> I
>> >> run "mvn test" on my machine. But my local test failed at
>> >> "hyracks-storage-common-test" which I think is not related to my
>> change as
>> >> well...
>> >>
>> >> Failed tests:
>> >>
>> >> BufferCacheRegressionTest.testFlushBehaviorOnFileEviction:
>> 71->flushBehaviorTest:131
>> >> Page 0 of deleted file was lazily flushed in openFile(), corrupting the
>> >> data of a newly created file with the same name.
>> >>
>> >>
>> >> Best,
>> >> Xikui
>> >>
>> >> On Sat, Apr 1, 2017 at 9:53 AM, Till Westmann <ti...@apache.org
>> <mailto:ti...@apache.org>> wrote:
>> >>
>> >>> Hi Xikui,
>> >>>
>> >>> If you look at the failures, you’ll see that the error is
>> >>>
>> >>> java.net.BindException: Address already in use
>> >>>   at sun.nio.ch.Net.bind0(Native Method)
>> >>>   at sun.nio.ch.Net.bind(Net.java:433)
>> >>>   at sun.nio.ch.Net.bind(Net.java:425)
>> >>>   at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>> >>>   at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:127)
>> >>>   at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:554)
>> >>>   at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1258)
>> >>>   at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:512)
>> >>>   at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:497)
>> >>>   at io.netty.handler.logging.LoggingHandler.bind(LoggingHandler.java:191)
>> >>>   at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:512)
>> >>>   at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:497)
>> >>>   at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:980)
>> >>>   at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:250)
>> >>>   at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:363)

Re: Having problem with Gerrit and Jenkins

2017-04-01 Thread Xikui Wang
Hi Till,

I went through the log on Jenkins. The "Address already in use" exception
first occurs in the hyracks-client package. That works fine locally when I
run "mvn test" on my machine. But my local test failed at
"hyracks-storage-common-test" which I think is not related to my change as
well...

Failed tests:

BufferCacheRegressionTest.testFlushBehaviorOnFileEviction:71->flushBehaviorTest:131
Page 0 of deleted file was lazily flushed in openFile(), corrupting the
data of a newly created file with the same name.


Best,
Xikui

On Sat, Apr 1, 2017 at 9:53 AM, Till Westmann <ti...@apache.org> wrote:

> Hi Xikui,
>
> If you look at the failures, you’ll see that the error is
>
> java.net.BindException: Address already in use
> at sun.nio.ch.Net.bind0(Native Method)
> at sun.nio.ch.Net.bind(Net.java:433)
> at sun.nio.ch.Net.bind(Net.java:425)
> at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelI
> mpl.java:223)
> at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(Ni
> oServerSocketChannel.java:127)
> at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(Abstrac
> tChannel.java:554)
> at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(Def
> aultChannelPipeline.java:1258)
> at io.netty.channel.AbstractChannelHandlerContext.invokeBind(Ab
> stractChannelHandlerContext.java:512)
> at io.netty.channel.AbstractChannelHandlerContext.bind(Abstract
> ChannelHandlerContext.java:497)
> at io.netty.handler.logging.LoggingHandler.bind(LoggingHandler.
> java:191)
> at io.netty.channel.AbstractChannelHandlerContext.invokeBind(Ab
> stractChannelHandlerContext.java:512)
> at io.netty.channel.AbstractChannelHandlerContext.bind(Abstract
> ChannelHandlerContext.java:497)
> at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelP
> ipeline.java:980)
> at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:250)
> at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap
> .java:363)
> at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(A
> bstractEventExecutor.java:163)
> at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTas
> ks(SingleThreadEventExecutor.java:418)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:454)
> at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(
> SingleThreadEventExecutor.java:873)
> at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnabl
> eDecorator.run(DefaultThreadFactory.java:144)
> at java.lang.Thread.run(Thread.java:745)
>
> So the test could not start AsterixDB and a probable reason is that a
> previous test was not able to shut down. I would expect that you see the
> same result if you run the tests locally.
>
> Is that the case?
>
> Cheers,
> Till
>
>
> On 1 Apr 2017, at 9:44, Xikui Wang wrote:
>
> Good morning Devs,
>>
>> I am having a problem with one patch [1] on Gerrit and Jenkins. The
>> SonarQube violation detected is not in the patched code, but in the
>> original code. Also, the Jenkins Job keeps failing on test cases which
>> seem
>> to be not related to the change [2]. I have tried to abandon and resubmit
>> it as a new patch, but the problem remains. Has anyone had similar issue
>> before? Thanks!
>>
>> Best,
>> Xikui
>>
>>
>> [1] https://asterix-gerrit.ics.uci.edu/#/c/1648/
>> [2] https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/4916/
>>
>


Having problem with Gerrit and Jenkins

2017-04-01 Thread Xikui Wang
Good morning Devs,

I am having a problem with one patch [1] on Gerrit and Jenkins. The
SonarQube violation detected is not in the patched code, but in the
original code. Also, the Jenkins Job keeps failing on test cases which seem
to be not related to the change [2]. I have tried to abandon and resubmit
it as a new patch, but the problem remains. Has anyone had a similar issue
before? Thanks!

Best,
Xikui


[1] https://asterix-gerrit.ics.uci.edu/#/c/1648/
[2] https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/4916/


Re: TwitterFirehoseStream feed parameters

2017-03-13 Thread Xikui Wang
Aha! @Steven, Thanks for pointing that out. Sent to Thor directly.

For others who are interested in TwitterFirehoseStream, test case feeds_07
and feeds_08 are examples for it.

Best,
Xikui

On Mon, Mar 13, 2017 at 3:03 PM, Steven Jacobs <sjaco...@ucr.edu> wrote:

> @Xikui I think the attachments get stripped when you send to dev, maybe
> send to Thor directly.
> Steven
>
> On Mon, Mar 13, 2017 at 2:59 PM, Xikui Wang <xik...@uci.edu> wrote:
>
> > Hi Thor:
> >
> > Here is an example of the full lifecycle of TwitterFirehoseStream. Let me
> > know if you have any questions about this.
> >
> > Best,
> > Xikui
> >
> > On Mon, Mar 13, 2017 at 2:25 PM, Mike Carey <dtab...@gmail.com> wrote:
> >
> >> @Xikui:  Can you reply?  Thx!
> >>
> >>
> >>
> >> On 3/13/17 2:15 PM, Thor Martin Abrahamsen wrote:
> >>
> >>> Does anyone know which feed parameters are needed to instantiate the
> >>> TwitterFirehoseStream[1]? (the one which generates generic Tweets).
> >>>
> >>>
> >>> Best regards
> >>> Thor Martin Abrahamsen
> >>> Student at Norwegian University of Science and Technology
> >>>
> >>> [1] - https://github.com/apache/asterixdb/blob/master/asterixdb/asterix-external-data/src/test/java/org/apache/asterix/external/input/stream/TwitterFirehoseStreamFactory.java
> >>>
> >>>
> >>>
> >>
> >
>


Re: TwitterFirehoseStream feed parameters

2017-03-13 Thread Xikui Wang
Hi Thor:

Here is an example of the full lifecycle of TwitterFirehoseStream. Let me
know if you have any questions about this.

Best,
Xikui

On Mon, Mar 13, 2017 at 2:25 PM, Mike Carey  wrote:

> @Xikui:  Can you reply?  Thx!
>
>
>
> On 3/13/17 2:15 PM, Thor Martin Abrahamsen wrote:
>
>> Does anyone know which feed parameters are needed to instantiate the
>> TwitterFirehoseStream[1]? (the one which generates generic Tweets).
>>
>>
>> Best regards
>> Thor Martin Abrahamsen
>> Student at Norwegian University of Science and Technology
>>
>> [1] - https://github.com/apache/asterixdb/blob/master/asterixdb/asterix-external-data/src/test/java/org/apache/asterix/external/input/stream/TwitterFirehoseStreamFactory.java
>>
>>
>>
>


Re: [VOTE] Release Apache AsterixDB 0.9.0 and Hyracks 0.3.0 (RC2)

2017-01-19 Thread Xikui Wang
+1

Verified the Twitter adaptor by dropping the twitter4j library into the repo
directory and repacking the server assembly with it.

Best,
Xikui

On Thu, Jan 19, 2017 at 3:00 PM, Mike Carey  wrote:

> +1 for this release
>
> Successfully downloaded and started the system and did some SQL++ tutorial
> examples using the NCService binary installer.  Worked like a charm!
>
> Cheers,
>
> Mike
>
>
>
> On 1/18/17 7:50 PM, Ian Maxon wrote:
>
>> Hi again everyone,
>>
>> Please verify and vote on the first non-incubating Apache AsterixDB
>> Release!
>> This 2nd RC addresses build issues noticed in the previous RC, along with
>> some minor license tweaks.
>> This release utilizes a series of improvements around the actual release
>> process that will hopefully shorten the interval between releases. A
>> further email detailing the features contained in this release as compared
>> to the previous incubating release will be forthcoming once a suitable RC
>> passes voting.
>>
>> The tags to be voted on are:
>>
>> apache-asterixdb-0.9.0-rc2
>> commit: 4383bdde78c02d597be65ecf467c5a7df85a2055
>> link:
>> https://git-wip-us.apache.org/repos/asf?p=asterixdb.git;a=tag;h=refs/tags/apache-asterixdb-0.9.0-rc2
>>
>> and
>>
>> apache-hyracks-0.3.0-rc2
>> commit: def643d586b62b2616b8ab8e6fc3ba598cf5ad67
>> link:
>> https://git-wip-us.apache.org/repos/asf?p=asterixdb.git;a=tag;h=refs/tags/apache-hyracks-0.3.0-rc2
>>
>> The artifacts, sha1s, and signatures (for each artifact) are at:
>>
>> AsterixDB Source
>> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.0-source-release.zip
>> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.0-source-release.zip.asc
>> https://dist.apache.org/repos/dist/dev/asterixdb/apache-asterixdb-0.9.0-source-release.zip.sha1
>>
>> SHA1: 49f8df822c6273a310027d3257a79afb45c8d446
>>
>> Hyracks Source
>> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.0-source-release.zip
>> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.0-source-release.zip.asc
>> https://dist.apache.org/repos/dist/dev/asterixdb/apache-hyracks-0.3.0-source-release.zip.sha1
>>
>> SHA1: 4d042cab164347f0cc5cc1cfb3da8d4f02eea1de
>>
>> AsterixDB NCService Installer:
>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.0-binary-assembly.zip
>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.0-binary-assembly.zip.asc
>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-server-0.9.0-binary-assembly.zip.sha1
>>
>> SHA1: 46c4cc3dc09e915d4b1bc6f912faef389488fdb6
>>
>> AsterixDB Managix Installer
>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-installer-0.9.0-binary-assembly.zip
>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-installer-0.9.0-binary-assembly.zip.asc
>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-installer-0.9.0-binary-assembly.zip.sha1
>>
>> SHA1: 41497dbadb0ad281ba0a10ee87eaa5f7afa78cef
>>
>> AsterixDB YARN Installer
>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-yarn-0.9.0-binary-assembly.zip
>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-yarn-0.9.0-binary-assembly.zip.asc
>> https://dist.apache.org/repos/dist/dev/asterixdb/asterix-yarn-0.9.0-binary-assembly.zip.sha1
>>
>> SHA1: 3ade0d2957e7f3e465e357aced6712ef72598613
>>
>> Additionally, a staged maven repository is available at:
>>
>> https://repository.apache.org/content/repositories/orgapacheasterix-1024/
>>
>> The KEYS file containing the PGP keys used to sign the release can be
>> found at
>>
>> https://dist.apache.org/repos/dist/release/asterixdb/KEYS
>>
>> RAT was executed as part of Maven via the RAT maven plugin, but
>> excludes files that are:
>>
>> - data for tests
>> - procedurally generated,
>> - or source files which come without a header mentioning their license,
>>but have an explicit reference in the LICENSE file.
>>
>>
>> The vote is open for 72 hours, or until the necessary number of votes
>> (3 +1) has been reached.
>>
>> Please vote
>> [ ] +1 release these packages as Apache AsterixDB 0.9.0 and
>> Apache Hyracks 0.3.0
>> [ ] 0 No strong feeling either way
>> [ ] -1 do not release one or both packages because ...
>>
>> Thanks!
>>
>>
>


Re: Dependency Error with no file reference

2017-01-17 Thread Xikui Wang
I had the same issue. I think this is because the recently introduced
maven-dependency-plugin detected a dependency issue. If you rerun the
command with the -X option, i.e., mvn install -DskipTests -rf :asterix-bad,
you will see detailed information about which dependency is declared in
your pom.xml but not used. Removing it will resolve the issue.

My follow-up question on this: the maven-dependency-plugin seems not to work
very well with 'non-intermediate' dependencies. In my case, I replicated the
executionTest in a new package. Adding a dependency that is not
'intermediate' triggers the dependency-detection problem, while removing it
causes a class-not-found issue. Any suggestions? Perhaps we can use the same
executionTest for all test scripts in different packages? I guess this
problem also exists in the bad package...

Best,
Xikui

On Tue, Jan 17, 2017 at 3:40 PM, Steven Jacobs  wrote:

> Hi all,
> I'm seeing a strange build error recently with dependencies. I am currently
> working on a branch of the BAD project, and none of my file changes are in
> the poms at all. Yet I get a cryptic build failure with the following
> message:
>
> "[ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-dependency-plugin:2.10:analyze-only
> (default) on project asterix-bad: Dependency problems found -> [Help 1]"
>
> And no other information on what the "dependency problems" are. Has anyone
> ever seen such an issue? I am currently lost with trying to resolve this.
> The full output is below. There are a few other messages that seem weird,
> but none of them seem to be related to the dependency error.
>
> [INFO] Building asterix-bad 1.0.0-SNAPSHOT
> [INFO]
> 
> [INFO]
> [INFO] --- maven-clean-plugin:3.0.0:clean (default-clean) @ asterix-bad ---
> [INFO] Deleting
> /Users/stevenjacobs/asterix/asertixdb/asterixdb/asterix-opt/asterix-bad/target
> [INFO]
> [INFO] --- maven-enforcer-plugin:1.4.1:enforce (enforce-versions) @
> asterix-bad ---
> [INFO]
> [INFO] --- asterix-grammar-extension-maven-plugin:0.8.9-SNAPSHOT:grammarix
> (default) @ asterix-bad ---
> [INFO] Base dir:
> /Users/stevenjacobs/asterix/asertixdb/asterixdb/asterix-opt/asterix-bad
> [INFO] Grammar-base: ../../asterix-lang-aql/src/main/javacc/AQL.jj
> [INFO] Grammar-extension: src/main/resources/lang-extension/lang.txt
> [INFO] Output: target/generated-resources/javacc/grammar.jj
> [INFO]
> [INFO] --- javacc-maven-plugin:2.6:javacc (javacc) @ asterix-bad ---
> Java Compiler Compiler Version 5.0 (Parser Generator)
> (type "javacc" with no arguments for help)
> Reading from file
> /Users/stevenjacobs/asterix/asertixdb/asterixdb/asterix-opt/asterix-bad/target/generated-resources/javacc/grammar.jj
> . . .
> Warning: Choice conflict in (...)* construct at line 869, column 5.
>  Expansion nested within construct and expansion following construct
>  have common prefixes, one of which is: "+"
>  Consider using a lookahead of 2 or more for nested expansion.
> Warning: Choice conflict in [...] construct at line 1120, column 27.
>  Expansion nested within construct and expansion following construct
>  have common prefixes, one of which is: "."
>  Consider using a lookahead of 2 or more for nested expansion.
> Warning: Choice conflict in [...] construct at line 1527, column 3.
>  Expansion nested within construct and expansion following construct
>  have common prefixes, one of which is: "("
>  Consider using a lookahead of 2 or more for nested expansion.
> Warning: Choice conflict in (...)* construct at line 1819, column 24.
>  Expansion nested within construct and expansion following construct
>  have common prefixes, one of which is: "["
>  Consider using a lookahead of 2 or more for nested expansion.
> Warning: Choice conflict in [...] construct at line 2108, column 75.
>  Expansion nested within construct and expansion following construct
>  have common prefixes, one of which is: "("
>  Consider using a lookahead of 2 or more for nested expansion.
> Warning: Choice conflict in [...] construct at line 2150, column 75.
>  Expansion nested within construct and expansion following construct
>  have common prefixes, one of which is: "("
>  Consider using a lookahead of 2 or more for nested expansion.
> Warning: Choice conflict in [...] construct at line 2151, column 5.
>  Expansion nested within construct and expansion following construct
>  have common prefixes, one of which is: "with"
>  Consider using a lookahead of 2 or more for nested expansion.
> Warning: Choice conflict in [...] construct at line 2509, column 3.
>  Expansion nested within construct and expansion following construct
>  have common prefixes, one of which is: "with"
>  Consider using a 

Re: Orderedlist vs. unorderedlist as default open type

2016-11-03 Thread Xikui Wang
That's an implementation mistake in TweetParser. Sorry about that. Will
submit a fix soon.

On Thu, Nov 3, 2016 at 10:17 AM, Mike Carey  wrote:

> This seems odd to me too - in the absence of schema, since JSON only has
> ordered lists, it seems like that would be the natural default.
>
>
>
> On 11/3/16 4:11 AM, Wail Alkowaileet wrote:
>
>> Dears,
>>
>> Currently, unordered list is the default type of JSON array if it resides
>> in the open part.
>> That means the user won't be able to access any item of the list using
>> index. Which is unexpected. At least for my colleagues who use AsterixDB.
>> I
>> think only JSON types should appear in the open part.
>>
>> Also, I believe there's inconsistency. When we do *group by .. with ..*
>> the result of "with" clause is ordered list.
>>
>> Any thoughts ?
>>
>>
>


Re: [jira] [Commented] (ASTERIXDB-1694) Fail running Tweet Feed on Cluster of 16 nodes (while succeed on 4 nodes)

2016-10-21 Thread Xikui Wang
Hi Devs,

I'd like to put a note on this problem here, in case anyone hits it again
or has an insight into what's causing it.

Basically, this problem is caused by a failed initialization of Log4J in
Twitter4J (so not our problem :D). The solution is enclosed in Mingda's last
reply. However, the cause of the problem is still unclear to us, especially
why it works on 4 nodes but not on 16.
Code snippet [1] shows how Twitter4J creates its logger: it scans all
possible logging libraries and picks the first one that is available. The
feed adaptor in AsterixDB runs on only one of the nodes in the cluster.
According to Mingda, if we shut down the node with the log problem, the new
node that the adaptor runs on has the same problem.

As for a permanent solution, I can probably turn off the Twitter4j logger in
code, or hardcode the configuration to avoid this problem in the future.

If anyone has a better idea, please let me know. Thanks! :)

[1]
https://github.com/yusuke/twitter4j/blob/4ebca9da71b271775624b11b5197af99a57bf175/twitter4j-core/src/internal-logging/java/twitter4j/Logger.java
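As a sketch of the "hardcode the configuration" option above (my own illustration, not committed AsterixDB code): Twitter4J honors the `twitter4j.loggerFactory` system property before falling back to the classpath scan in [1], so setting the property before any Twitter4J class loads pins the logger choice and sidesteps the scan. The factory class names used here are the ones the Twitter4J 4.x configuration docs list:

```java
// Pin Twitter4J's logger choice up front. This must run before
// twitter4j.Logger's static initializer fires (i.e., before any
// Twitter4J class is loaded on the node running the feed adaptor).
public final class Twitter4jLoggingConfig {

    private Twitter4jLoggingConfig() {
    }

    /** Route Twitter4J logging to java.util.logging, which AsterixDB already uses. */
    public static void useJul() {
        System.setProperty("twitter4j.loggerFactory", "twitter4j.JULLoggerFactory");
    }

    /** Silence Twitter4J logging entirely. */
    public static void silence() {
        System.setProperty("twitter4j.loggerFactory", "twitter4j.NullLoggerFactory");
    }

    public static void main(String[] args) {
        silence();
        System.out.println(System.getProperty("twitter4j.loggerFactory"));
    }
}
```

Either choice removes the dependence on whichever logging jars happen to be visible on a given node, which is what made the 4-node and 16-node behavior diverge.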

On Wed, Oct 19, 2016 at 8:17 PM, mingda li <limingda1...@gmail.com> wrote:

> Dear all,
> Good news!
> For the official version of AsterixDB, the datafeed problem for Twitter can
> be solved for 16 nodes by adding a log4j.properties
> to asterix-server-0.8.9-SNAPSHOT-binary-assembly/etc and /repo. I will try
> Wail's version. And see why this can work for 4 nodes without adding the
> log4j.properties file.
>
> BTW, the log4j.properties file is as following, if you may need someday:
> # Set root logger level to DEBUG and its only appender to A1.
> log4j.rootLogger=DEBUG, A1
>
> # A1 is set to be a ConsoleAppender.
> log4j.appender.A1=org.apache.log4j.ConsoleAppender
>
> # A1 uses PatternLayout.
> log4j.appender.A1.layout=org.apache.log4j.PatternLayout
> log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n
>
>
>
> On Wed, Oct 19, 2016 at 8:07 PM, mingda li <limingda1...@gmail.com> wrote:
>
> > BTW, I tried to run AsterixDB's official version of the Tweet feed and
> > also met a similar problem on Node 14.
> > I followed Xikui's suggestion to add a log4j.properties
> > in asterix-server-0.8.9-SNAPSHOT-binary-assembly/etc and /repo. I checked
> > the log file and found it changed to the attachment. It seems it begins
> > to catch tweets but then fails.
> >  nc-red15.log
> > <https://drive.google.com/file/d/0B-3JraLWXVVGVVJ0T0VHaVpGYzA/view?usp=drive_web>
> >
> >
> > On Wed, Oct 19, 2016 at 12:51 PM, mingda li <limingda1...@gmail.com>
> > wrote:
> >
> >> En, that is a good suggestion.
> >> Since this is not my version of AsterixDB, we should ask Wail if he has
> >> ever set something related to twitter4j's log.
> >>
> >> @Wail, have you ever set such thing?
> >>
> >> On Tue, Oct 18, 2016 at 6:19 PM, Xikui Wang <xik...@uci.edu> wrote:
> >>
> >>> It looks like the log4j in Twitter4J is not correctly initialized[1].
> Did
> >>> you customize the log4j in Twitter4J configuration in your system like
> >>> this[2]? By default, it's printed to standard output.
> >>>
> >>>
> >>> [1] http://activemq.apache.org/log4j-warn-no-appenders-could-be-found-for-logger.html
> >>> [2] http://twitter4j.org/en/configuration.html#logger
> >>>
> >>> On Tue, Oct 18, 2016 at 5:42 PM, mingda li <limingda1...@gmail.com>
> >>> wrote:
> >>>
> >>> > Hi,
> >>> >
> >>> > When I start 16 nodes, I found the 15th node has log file different
> >>> from
> >>> > others as following.
> >>> >
> >>> > Oct 18, 2016 5:23:10 PM org.apache.hyracks.control.nc.NCDriver main
> >>> > SEVERE: Setting uncaught exception handler org.apache.hyracks.api.lifecycle.LifeCycleComponentManager@73f792cf
> >>> > Oct 18, 2016 5:23:10 PM org.apache.hyracks.control.nc.NodeControllerService start
> >>> > INFO: Starting NodeControllerService
> >>> > Oct 18, 2016 5:23:10 PM org.apache.asterix.hyracks.bootstrap.NCApplicationEntryPoint start
> >>> > INFO: Starting Asterix node controller: red15
> >>> > log4j:WARN No appenders could be found for logger (twitter4j.TwitterStreamImpl).
> >>> > log4j:WARN Please initialize the log4j system properly.
> >>> > log

Debugging with generated code

2016-09-13 Thread Xikui Wang
Hi Devs,

I am debugging with FieldAccessByIndexEvalFactory, and the breakpoints are
not working here. This may be related to ASTERIXDB-1459. Is this fixed in
the code base? Thanks.

Best,
Xikui


Re: External function dependency problem

2016-07-02 Thread Xikui Wang
(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
...Unexpected!

On Sat, Jul 2, 2016 at 7:23 PM, Raman Grover <ramangrove...@gmail.com>
wrote:

> i am missing the attachment
> On Jul 2, 2016 6:39 PM, "Xikui Wang" <xik...@uci.edu> wrote:
>
> > Hi Raman,
> >
> > Thanks for your help. I tried this quick fix on my branch, but it
> > introduces some new exceptions. I think this causes Asterix fails at
> > entering the external function. The error message is attached.
> >
> > Best,
> > Xikui
> >
> > On Fri, Jul 1, 2016 at 10:11 AM, Raman Grover <ramangrove...@gmail.com>
> > wrote:
> >
> >> Operations related to setting up an external library are contained in
> >> ExternalLibraryUtil
> >> <https://github.com/apache/asterixdb/blob/master/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/external/ExternalLibraryUtils.java>
> >>
> >> At line 382, we have
> >> // create and return the class loader
> >>
> >> ClassLoader classLoader = new URLClassLoader(urls, parentClassLoader);
> >> return classLoader;
> >>
> >> Above, we have the parentClassLoader set to the classloader for
> >> ExternalLibraryUtil which is the application class loader (AsterixDB's
> >> classloader that loads the dependencies from pom.xml). The proposed
> >> solution (a) in earlier thread - skipping application classloader would
> >> translate to replacing the above code with
> >>
> >> ClassLoader classLoader = new URLClassLoader(urls, null);
> >>
> >> Regards,
> >> Raman
> >>
> >> On Thu, Jun 30, 2016 at 4:41 PM, Xikui Wang <xik...@uci.edu> wrote:
> >>
> >> > >
> >> > > Hi Abdullah,
> >> >
> >> > I reverted my code to reproduce the problem. Note that this external
> >> > function has a couple of other bugs, but the dependency one is blocking
> >> > the others, so this should be enough to reproduce the problem.
> >> >
> >> > >
> >> > The external function package is geoTag.zip.
> >> >
> >> > > ​
> >> > >
> >> > > ​Test scripts are in tweetGeoTag.zip
> >> >
> >> > External function is loading data from data/, i.e.: data/state.json .
> So
> >> > all json files in data.zip need to be placed under
> >> > ../asterixdb/asterix-app/data/
> >> >
> >> > The real_tweets_adm.adm used in ddl is also attached.
> >> >
> >> > This setting will cause
> >> >
> >> > > java.lang.NoSuchMethodError: com.fasterxml.jackson.core.JsonFactory.requiresPropertyOrdering()Z
> >> > > at com.fasterxml.jackson.databind.ObjectMapper.(ObjectMapper.java:541)
> >> > > at com.fasterxml.jackson.databind.ObjectMapper.(ObjectMapper.java:452)
> >> > > at org.wololo.geojson.GeoJSONFactory.(GeoJSONFactory.java:17)
> >> > > at edu.uci.ics.cloudberry.gnosis.USGeoJSONIndex.loadShape(IGeoIndex.scala:29)
> >> > > at edu.uci.ics.cloudberry.gnosis.USGeoGnosis$.loadShape(USGeoGnosis.scala:101)
> >> > > at edu.uci.ics.cloudberry.gnosis.USGeoGnosis$$anonfun$load$1.apply(USGeoGnosis.scala:20)
> >> > > at edu.uci.ics.cloudberry.gnosis.USGeoGnosis$$anonfun$load$1.apply(USGeoGnosis.scala:18)
> >> > > at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
> >> > > at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
> >> > > at scala.collection.immutable.List.foreach(List.scala:381)
> >> > > at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
> >> > > at scala.collection.immutable.List.map(List.scala:285)
> >> > > at edu.uci.ics.cloudberry.gnosis.USGeoGnosis.load(USGeoGnosis.scala:18)
> >
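For reference, the two options discussed in this thread (Raman's (a), a null parent that skips the application classpath, and (b), child-first delegation) can be sketched in plain Java. The class names below are illustrative only, not AsterixDB's actual classes:

```java
import java.net.URL;
import java.net.URLClassLoader;

// Option (a): a null parent skips the application classpath entirely, so the
// external library sees only its own jars (java.* still comes from the
// bootstrap loader). The library must bundle every dependency it needs.
class IsolatedLibraryLoader {
    static ClassLoader create(URL[] libraryJars) {
        return new URLClassLoader(libraryJars, null);
    }
}

// Option (b): child-first delegation - look in the library jars before asking
// the parent, so the library's jackson 2.7.1 wins over AsterixDB's 2.0.0.
class ChildFirstClassLoader extends URLClassLoader {
    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    c = findClass(name); // library jars first
                } catch (ClassNotFoundException e) {
                    c = super.loadClass(name, false); // then normal parent delegation
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}

class ClassLoaderDemo {
    public static void main(String[] args) throws Exception {
        ClassLoader childFirst =
                new ChildFirstClassLoader(new URL[0], ClassLoader.getSystemClassLoader());
        // With no library jars, child-first falls back to the parent:
        System.out.println(childFirst.loadClass("java.lang.String").getName());
    }
}
```

Option (a) is simpler but requires the library bundle to be fully self-contained; option (b) lets the library's jars shadow AsterixDB's versions while still delegating to the parent for anything not bundled.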

Re: External function dependency problem

2016-06-30 Thread Xikui Wang
In option (b), we provide a custom
>> > implementation of classloader that attempts to load a class prior to
>> > delegating to the parent. This way, the library classes and packaged
>> > dependencies override any system level classes from the class path and
>> even
>> > the classes contained in rt.jar and i18n.jar.
>> >
>> > I am opening the discussion here to suggest further alternatives or
>> provide
>> > preferences.
>> >
>> > I have a preference for (a)  (skipping the system class loader) for two
>> > reasons:
>> >
>> > a) it is simpler
>> >
>> > b) the other option allows a custom class loader to override classes in
>> > rt.jar, which is OK but not how classloaders are supposed to work in
>> > principle.
>> >
>> > Regards,
>> >
>> > Raman
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Jun 29, 2016 11:07 PM, "Mike Carey" <dtab...@gmail.com> wrote:
>> >
>> > > Any classloader experts have suggestions...?
>> > > On Jun 29, 2016 10:26 PM, "Xikui Wang" <xik...@uci.edu> wrote:
>> > >
>> > > > Hi Devs,
>> > > >
>> > > > We found a problem when trying to build external functions for the
>> > > > cloudberry demo.
>> > > >
>> > > > When the external function depends on a certain library, the library
>> > > > that comes with the external function will be blocked by the same
>> > > > library in AsterixDB. In our case, our external function 'geoTag'
>> > > > uses jackson v2.7.1, and we packed all dependencies into one single
>> > > > jar. When running 'geoTag' on Asterix, it calls jackson v2.0.0 in
>> > > > AsterixDB, which causes a NullPointerException. We have to manually
>> > > > change the pom.xml in AsterixDB to fix that.
>> > > >
>> > > > We are wondering whether this is because we load the external
>> > > > function in a wrong way, or whether this is an interesting problem
>> > > > worth noticing. Thanks.
>> > > >
>> > > > Best,
>> > > > Xikui
>> > > >
>> > >
>> >
>>
>
>


External function dependency problem

2016-06-29 Thread Xikui Wang
Hi Devs,

We found a problem when trying to build external functions for the
cloudberry demo.

When the external function depends on a certain library, the library that
comes with the external function will be blocked by the same library in
AsterixDB. In our case, our external function 'geoTag' uses jackson v2.7.1,
and we packed all dependencies into one single jar. When running 'geoTag'
on Asterix, it calls jackson v2.0.0 in AsterixDB, which causes a
NullPointerException. We have to manually change the pom.xml in AsterixDB to
fix that.

We are wondering whether this is because we load the external function in a
wrong way, or whether this is an interesting problem worth noticing. Thanks.

Best,
Xikui


Re: User Define Function (UDF) in AsterixDB

2016-06-03 Thread Xikui Wang
Hi Heri,

Thanks for sharing the document. It is useful, as the general structure of
UDFs remains the same.

Best,
Xikui

On Thu, Jun 2, 2016 at 11:34 PM, Heri Ramampiaro <heri...@gmail.com> wrote:

> Xikui,
>
> Enclosed is an instruction based on the older version of feeds and UDFs
> that perhaps could help you
> figure out the principle behind installing external libs in AsterixDB.
>
> Best,
> -heri
>
>
>
>
> > On Jun 2, 2016, at 23:44, Xikui Wang <xik...@uci.edu> wrote:
> >
> > Hi Abdullah,
> >
> > Thanks for your help. I met an error when I was trying to execute 'install
> > externallibtest testlib PATH/TO/testlib-zip-binary-assembly.zip' from the
> > web query interface. Perhaps I used it in the wrong way?
> >
> > Best,
> > Xikui
> >
> > On Thu, Jun 2, 2016 at 2:26 PM, abdullah alamoudi <bamou...@gmail.com>
> > wrote:
> >
> >> Hi Xikui,
> >> 1. How to install UDF on instance running from Eclipse+
> >> AsterixHyracksIntegrationUtil?
> >>
> >> There are a few external library test cases, you can look at them and
> see
> >> how we test those. One thing you will notice is that we only test a few
> >> examples. Clearly, we can do better. You can find the test cases in:
> >>
> >> asterixdb/asterixdb/asterix-app/src/test/resources/runtimets/queries/external-library
> >>
> >> As for the difference between scalar, aggregate, and unnest functions,
> here
> >> is the way I see it:
> >> 1. Scalar: one input to one output.
> >> 2. Aggregate: 0 or more inputs to one output.
> >> 3. Unnest: one input to 0 or more outputs.
> >>
> >> Hope that helps,
> >> Abdullah.
> >>
> >> On Thu, Jun 2, 2016 at 11:40 PM, Xikui Wang <xik...@uci.edu> wrote:
> >>
> >>> Hi Devs,
> >>>
> >>> I want to use a UDF to process the Tweets that I got from the feed, and
> >>> I ran into the following two questions. Hope you guys can help me or
> >>> point me to the right documentation.
> >>>
> >>> 1. How to install UDF on instance running from
> >>> Eclipse+AsterixHyracksIntegrationUtil?
> >>>
> >>> The website only mentions how to install with Managix. I am wondering
> >>> if there is a way for me to install it on an instance running in
> >>> Eclipse, which is easier for debugging.
> >>>
> >>> 2. Implementation of UDF
> >>>
> >>> I found several UDFs in
> >>>
> >>> asterixdb/asterix-external-data/src/test/java/org/apache/asterix/external/library,
> >>> like SumFunction and ParseTweetFunction. I assume that if I want to
> >>> implement a new UDF, it needs to implement the IExternalScalarFunction
> >>> interface and be put under the same directory? I also found 'aggregate'
> >>> and 'unnest' types, which are not implemented yet. Just out of
> >>> curiosity, what is the difference between them?
> >>>
> >>> Thanks in advance! :)
> >>>
> >>> Best,
> >>> Xikui
> >>>
> >>
>
>
>


Re: User Define Function (UDF) in AsterixDB

2016-06-02 Thread Xikui Wang
Hi Abdullah,

Thanks for your help. I met an error when I was trying to execute 'install
externallibtest testlib PATH/TO/testlib-zip-binary-assembly.zip' from the
web query interface. Perhaps I used it in the wrong way?

Best,
Xikui

On Thu, Jun 2, 2016 at 2:26 PM, abdullah alamoudi <bamou...@gmail.com>
wrote:

> Hi Xikui,
> 1. How to install UDF on instance running from Eclipse+
> AsterixHyracksIntegrationUtil?
>
> There are a few external library test cases, you can look at them and see
> how we test those. One thing you will notice is that we only test a few
> examples. Clearly, we can do better. You can find the test cases in:
> asterixdb/asterixdb/asterix-app/src/test/resources/runtimets/queries/external-library
>
> As for the difference between scalar, aggregate, and unnest functions, here
> is the way I see it:
> 1. Scalar: one input to one output.
> 2. Aggregate: 0 or more inputs to one output.
> 3. Unnest: one input to 0 or more outputs.
>
> Hope that helps,
> Abdullah.
>
> On Thu, Jun 2, 2016 at 11:40 PM, Xikui Wang <xik...@uci.edu> wrote:
>
> > Hi Devs,
> >
> > I want to use a UDF to process the Tweets that I got from the feed, and
> > I ran into the following two questions. Hope you guys can help me or
> > point me to the right documentation.
> >
> > 1. How to install UDF on instance running from
> > Eclipse+AsterixHyracksIntegrationUtil?
> >
> > The website only mentions how to install with Managix. I am wondering if
> > there is a way for me to install it on an instance running in Eclipse,
> > which is easier for debugging.
> >
> > 2. Implementation of UDF
> >
> > I found several UDFs in
> > asterixdb/asterix-external-data/src/test/java/org/apache/asterix/external/library,
> > like SumFunction and ParseTweetFunction. I assume that if I want to
> > implement a new UDF, it needs to implement the IExternalScalarFunction
> > interface and be put under the same directory? I also found 'aggregate'
> > and 'unnest' types, which are not implemented yet. Just out of curiosity,
> > what is the difference between them?
> >
> > Thanks in advance! :)
> >
> > Best,
> > Xikui
> >
>
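The three arities Abdullah describes can be illustrated with small stand-in interfaces. These are illustrative only, names of my own choosing rather than AsterixDB's actual external-function API:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Scalar: one input -> one output.
interface ScalarFunction<I, O> {
    O evaluate(I input);
}

// Aggregate: zero or more inputs -> one output.
interface AggregateFunction<I, O> {
    O evaluate(List<I> inputs);
}

// Unnest: one input -> zero or more outputs.
interface UnnestFunction<I, O> {
    Stream<O> evaluate(I input);
}

class ArityExamples {
    static final ScalarFunction<String, String> UPPER = s -> s.toUpperCase();

    static final AggregateFunction<Integer, Integer> SUM =
            xs -> xs.stream().mapToInt(Integer::intValue).sum();

    static final UnnestFunction<String, String> SPLIT_WORDS =
            s -> Arrays.stream(s.split("\\s+"));

    public static void main(String[] args) {
        System.out.println(UPPER.evaluate("tweet"));              // TWEET
        System.out.println(SUM.evaluate(Arrays.asList(1, 2, 3))); // 6
        System.out.println(SPLIT_WORDS.evaluate("a b c")
                .collect(Collectors.toList()));                   // [a, b, c]
    }
}
```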