) is under
development
--
*Regards,*
Wail Alkowaileet
d policies) of this APE are
not solidified yet, but the core concepts are in place and are ready for
the community's vote :)
EPIC: ASTERIXDB-3373 <https://issues.apache.org/jira/browse/ASTERIXDB-3373>
--
*Regards,*
Wail Alkowaileet
+1
*Regards,*
Wail Alkowaileet
On Fri, Mar 1, 2024 at 13:50 Ian Maxon wrote:
> Hi everyone,
>
> Please verify and vote on the latest stabilization release of Apache
> AsterixDB.
>
> The change that produced this release is up on Gerrit:
>
> https://asterix-gerrit
+1
*Regards,*
Wail Alkowaileet
On Thu, Feb 29, 2024 at 15:25 Ian Maxon wrote:
> Hi everyone,
>
> Please verify and vote on the latest release of Apache AsterixDB.
>
> The change that produced this release is up on Gerrit:
>
> https://asterix-gerrit.ics.uci.ed
+1
On Sat, Dec 2, 2023 at 11:25 Till Westmann wrote:
> +1
>
> > On Dec 2, 2023, at 11:23, Glenn Justo Galvizo wrote:
> >
> > +1 from me as well.
> >
> >> On Dec 2, 2023, at 10:27, Ian Maxon wrote:
> >>
> >> +1
> >>
> On Dec 1, 2023 at 12:28:23, Murtadha Al-Hubail
> wrote:
> >>>
> >>>
re:
> >
> > - data for tests
> > - procedurally generated,
> > - or source files which come without a header mentioning their license,
> > but have an explicit reference in the LICENSE file.
> >
> >
> > The vote is open for 72 hours, or until the necessary number of votes
> > (3 +1) has been reached.
> >
> > Please vote
> > [ ] +1 release these packages as Apache AsterixDB JDBC Driver 0.9.8.2
> > [ ] 0 No strong feeling either way
> > [ ] -1 do not release one or both packages because ...
> >
> > Thanks!
> >
>
--
*Regards,*
Wail Alkowaileet
> +1 -- Now maybe users will stop trying to retrieve huge results and
> wondering why the UI is choking! :-) This capability is actually long
> overdue.
> >
> > On 10/24/23 9:53 AM, Wail Alkowaileet wrote:
> >> Currently, AsterixDB does not have a clean way to extract quer
:
> > >
> > > +1 this is much nicer
> > >
> > >> On 2023/10/26 05:05:01 Mike Carey wrote:
> > >> PS - I assume the semantics will be UPSERT-based? (Vs. one-time or
> > >> INSERT-based?)
> > >>
> > >>> On 10/
onally, the proposed syntax will make COPY FROM and
COPY TO more consistent with each other.
Example of COPY TO:
> COPY Customers
> TO localfs
> PATH("localhost:///myData/Customers")
> WITH {
> "format" : "json"
> };
>
--
*Regards,*
Wail Alkowaileet
that this ordering isn't global but per
partition.
Also, the written files will be compressed using *gzip*, and each file
will contain at most 100 records (*max-objects-per-file*).
EPIC: ASTERIXDB-3286 <https://issues.apache.org/jira/browse/ASTERIXDB-3286>
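To make the per-file limit concrete, here is a minimal Python sketch of the behavior described above: a single partition sorts its own records and writes gzip-compressed files holding at most max-objects-per-file records each. It only illustrates the semantics and is not AsterixDB's implementation; the function names, the file naming scheme, and the JSON-lines layout are assumptions made for the example.

```python
import gzip
import json
import os
import tempfile


def write_partition(records, out_dir, order_key, max_objects_per_file=100):
    """Sort one partition's records and write them as gzip-compressed JSON lines.

    Ordering is applied within this partition only, mirroring the
    "per partition, not global" note above. File names are hypothetical.
    """
    ordered = sorted(records, key=lambda record: record[order_key])
    paths = []
    for start in range(0, len(ordered), max_objects_per_file):
        chunk = ordered[start:start + max_objects_per_file]
        path = os.path.join(out_dir, f"part-{len(paths):05d}.json.gz")
        with gzip.open(path, "wt", encoding="utf-8") as out:
            for record in chunk:
                out.write(json.dumps(record) + "\n")
        paths.append(path)
    return paths


def read_file(path):
    """Read one gzip-compressed JSON-lines file back into a list of dicts."""
    with gzip.open(path, "rt", encoding="utf-8") as src:
        return [json.loads(line) for line in src]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        customers = [{"id": i} for i in range(250, 0, -1)]
        files = write_partition(customers, out_dir, order_key="id")
        print([len(read_file(path)) for path in files])
```

With 250 records and the limit of 100, the partition ends up as three files, the last one only half full.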
--
*Regards,*
Wail Alkowaileet
> > - procedurally generated,
> > - or source files which come without a header mentioning their license,
> > but have an explicit reference in the LICENSE file.
> >
> >
> > The vote is open for 72 hours, or until the necessary number of votes
> > (3 +1) has been reached.
> >
> > Please vote
> > [ ] +1 release these packages as Apache AsterixDB JDBC Driver 0.9.8.1
> > [ ] 0 No strong feeling either way
> > [ ] -1 do not release one or both packages because ...
> >
> > Thanks!
> >
>
--
*Regards,*
Wail Alkowaileet
>
> >>
> >> The KEYS file containing the PGP keys used to sign the release can be
> >> found at
> >>
> >> https://dist.apache.org/repos/dist/release/asterixdb/KEYS
> >>
> >> RAT was executed as part of Maven via the RAT maven plugin, but
> >> excludes files that are:
> >>
> >> - data for tests
> >> - procedurally generated,
> >> - or source files which come without a header mentioning their
> >> license,
> >> but have an explicit reference in the LICENSE file.
> >>
> >>
> >> The vote is open for 72 hours, or until the necessary number of votes
> >> (3 +1) has been reached.
> >>
> >> Please vote
> >> [ ] +1 release these packages as Apache AsterixDB 0.9.7.1 and
> >> Apache Hyracks 0.3.7.1
> >> [ ] 0 No strong feeling either way
> >> [ ] -1 do not release one or both packages because ...
> >>
> >> Thanks!
> >>
>
--
*Regards,*
Wail Alkowaileet
not accept the donation because...
>
> The vote will be open for 7 days.
>
> Please vote,
> Till
>
--
*Regards,*
Wail Alkowaileet
ENSE file.
> >
> >
> > The vote is open for 72 hours, or until the necessary number of votes
> > (3 +1) has been reached.
> >
> > Please vote
> > [ ] +1 release these packages as Apache AsterixDB 0.9.7 and
> > Apache Hyracks 0.3.7
> > [ ] 0 No strong feeling either way
> > [ ] -1 do not release one or both packages because ...
> >
> > Thanks!
>
--
*Regards,*
Wail Alkowaileet
They are used as input files for external datasets
On Wed, Sep 23, 2020, 09:24 Wail Alkowaileet wrote:
> They are used for the integration test.
>
> On Tue, Sep 22, 2020, 22:44 Till Westmann wrote:
>
>> Hi Wail,
>>
>> Could you provide a bit of context h
we try to avoid it where possible but it's hard to
> > sometimes. IMO, as long as the file is small and changes very rarely,
> > it's fine.
> > If it's not maybe there's some way to generate it on the fly from a
> > textual representation?
> >
> > On Tue
Devs,
Is it ok to have binary files (parquet files) as part of the code-base?
--
*Regards,*
Wail Alkowaileet
used to sign the release can be
> > > found at
> > >
> > > https://dist.apache.org/repos/dist/release/asterixdb/KEYS
> > >
> > > RAT was executed as part of Maven via the RAT maven plugin, but
> > > excludes files that are:
> > >
> > > - data for tests
> > > - procedurally generated,
> > > - or source files which come without a header mentioning their license,
> > > but have an explicit reference in the LICENSE file.
> > >
> > >
> > > The vote is open for 72 hours, or until the necessary number of votes
> > > (3 +1) has been reached.
> > >
> > > Please vote
> > > [ ] +1 release these packages as Apache AsterixDB 0.9.5 and
> > > Apache Hyracks 0.3.5
> > > [ ] 0 No strong feeling either way
> > > [ ] -1 do not release one or both packages because ...
> > >
> > > Thanks!
> > >
> >
>
--
*Regards,*
Wail Alkowaileet
nd not visible to upper operators?). If you
want to make it happen, a workaround might be to introduce a project
operator within the subplan of the subplan operator. Actually, if a
variable is not used, isn't IntroduceProjectsRule supposed to add a
project operator that removes it automatically?
Best,
Taew
che/asterixdb/blob/f2c18aa9646238ab2487ce3a964edfe3e61dd6e1/hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/ExtractCommonExpressionsRule.java#L192>
--
*Regards,*
Wail Alkowaileet
I think that works.
On Fri, Oct 11, 2019 at 11:46 AM Till Westmann wrote:
> Good question.
> We could also consider to change the test to run uncompressed to
> maintain some test coverage for the uncompressed case.
>
> Thoughts?
>
> Cheers,
> Till
>
> On 11 Oct 2
eate dataset DBLP1(DBLPType) primary key id with
> {"storage-block-compression": {"scheme": "none"}};*
>
> ...or override the default in the config file to be none:
>
>
> *storage.compression.block = none*
>
>
> I expect this to be merged into master today.
>
> Thanks,
>
> -MDB
>
--
*Regards,*
Wail Alkowaileet
> (3 +1) has been reached.
>
> Please vote
> [ ] +1 release these packages as Apache AsterixDB 0.9.5 and
> Apache Hyracks 0.3.5
> [ ] 0 No strong feeling either way
> [ ] -1 do not release one or both packages because ...
>
> Thanks!
>
--
*Regards,*
Wail Alkowaileet
b321",
> "B2": "b322"
> }]
> }
> ]);
>
> FROM root, root.B as B
> SELECT root.A, (
> FROM B
> LET I = I + 1
> SELECT B.B1, B.B2, I
> ) AS B;
>
>
>
> Basically I would like I to be an index of the occurrence of B being
> produced. Would be value 1 or 2.
>
> Output should look like this:
>
> [ { "A": "a1", "B": [ { "B1": "b111", "B2": "b112", "I": 1 }, { "B1":
> "b121", "B2": "b122", "I": 2 } ] }
> , { "A": "a2", "B": [ { "B1": "b211", "B2": "b212", "I": 1 }, { "B1":
> "b221", "B2": "b222", "I": 2 } ] }
> , { "A": "a3", "B": [ { "B1": "b311", "B2": "b312", "I": 1 }, { "B1":
> "b321", "B2": "b322", "I": 2 } ] }
> ]
>
> Is there a way to do that?
>
> Thank you!
>
> Fady
>
>
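One way to pin down the requested semantics, independent of how it would be written in SQL++, is a small Python sketch: each element of the nested B array gets a 1-based position I within its parent record. This is not a SQL++ answer, only an illustration of the expected output; the helper name and its parameters are made up for the example, and the sample values are taken from the output quoted above.

```python
def number_nested(records, array_field="B", index_field="I"):
    """Return copies of the records with a 1-based position added to each nested item."""
    result = []
    for record in records:
        numbered = dict(record)
        numbered[array_field] = [
            {**item, index_field: position}
            for position, item in enumerate(record.get(array_field, []), start=1)
        ]
        result.append(numbered)
    return result


root = [
    {"A": "a1", "B": [{"B1": "b111", "B2": "b112"}, {"B1": "b121", "B2": "b122"}]},
    {"A": "a2", "B": [{"B1": "b211", "B2": "b212"}, {"B1": "b221", "B2": "b222"}]},
    {"A": "a3", "B": [{"B1": "b311", "B2": "b312"}, {"B1": "b321", "B2": "b322"}]},
]

if __name__ == "__main__":
    for row in number_nested(root):
        print(row)
```

The counter restarts for every parent record, which is what distinguishes it from a running count over the whole result.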
--
*Regards,*
Wail Alkowaileet
ure, I was thinking of introducing
a version number for the value format. This would allow us to change the
value's format without introducing a new Log Type.
Thoughts?
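As a rough illustration of the idea, here is a Python sketch of a value that carries a one-byte format version in front of its body, so that a reader can dispatch on the version instead of on a new log type. The layouts of "version 1" and "version 2" below are invented for the example and do not describe AsterixDB's actual log record format.

```python
import struct

# Hypothetical format versions, chosen only for this sketch.
VERSION_RAW = 1        # body is the payload as-is
VERSION_LENGTH = 2     # body is a 4-byte big-endian length followed by the payload


def encode_value(payload: bytes, version: int = VERSION_LENGTH) -> bytes:
    """Prefix the encoded body with a single version byte."""
    if version == VERSION_RAW:
        body = payload
    elif version == VERSION_LENGTH:
        body = struct.pack(">I", len(payload)) + payload
    else:
        raise ValueError(f"unknown value format version {version}")
    return struct.pack(">B", version) + body


def decode_value(data: bytes) -> bytes:
    """Read the version byte and decode the body with the matching layout."""
    version, body = data[0], data[1:]
    if version == VERSION_RAW:
        return body
    if version == VERSION_LENGTH:
        (length,) = struct.unpack_from(">I", body, 0)
        return body[4:4 + length]
    raise ValueError(f"unknown value format version {version}")


if __name__ == "__main__":
    print(decode_value(encode_value(b"old", VERSION_RAW)))
    print(decode_value(encode_value(b"new", VERSION_LENGTH)))
```

Old and new values can then coexist in the same log, and only the decoder needs to learn about a new layout.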
--
*Regards,*
Wail Alkowaileet
g is done outside the filter
> modification, by virtue of the other locks around component modifications.
> So there shouldn’t be any within the filters themselves. Did you notice
> unusual behavior?
>
> > On Dec 8, 2018, at 21:21, Wail Alkowaileet wrote:
> >
Dev,
Are the in-memory LSMComponentFilter (min and max) tuples thread safe? I
could not find any locking mechanism that guarantees they are thread safe.
--
*Regards,*
Wail Alkowaileet
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_131]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1142)
> [?:1.8.0_131]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:617)
> [?:1.8.0_131]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
>
--
*Regards,*
Wail Alkowaileet
at 8:47 PM, Taewoo Kim <wangs...@gmail.com> wrote:
> I have two questions. How would you want to compare two complex objects?
> And why do we need to do a hash?
>
> On Fri, Dec 29, 2017 at 20:31 Wail Alkowaileet <wael@gmail.com> wrote:
>
> > I think we shou
t; > >
> > >
> > >> On 12/28/17 11:14 PM, Taewoo Kim wrote:
> > >> If I remember correctly, we don't support deep equality comparison in
> > >> AsterixDB yet.
> > >>
> > >> Best,
> > >> Taewoo
> > >>
: Unsupported type: comparison operations (>, >=, <, and <=)
cannot process input type array
What should be the semantics for such operations?
--
*Regards,*
Wail Alkowaileet
One thing I'm not sure about: there are pom.xml.versionsBackup files in the
AsterixDB modules.
On Thu, Dec 21, 2017 at 11:39 AM, Wail Alkowaileet <wael@gmail.com>
wrote:
> +1
> Downloaded
> Verified signatures and hashes
> Verified source build + ran unit tests and integration tests
are chained
and work on a per-tuple basis (almost an iterator model within a frame) and can be
more expensive in terms of function calls.
Any suggestions?
--
*Regards,*
Wail Alkowaileet
>> I don't think it's the case... I tried on my local env, and it's using a
> >> primary index lookup instead of scan. Can you make sure the spelling of
> >> the
> >> primary key is correct?
> >>
> >> On Sun, Dec 3, 2017 at 3:49 PM, Wail Alkowaile
can and then filter the result, even though the query
predicate is on the primary key?
--
*Regards,*
Wail Alkowaileet
ted. As far as I remember,
>>> complex polygons such as self-intersecting polygons and polygons with
>>> holes
>>> are not supported. Not sure if they are supported now.
>>>
>>> Sattam
>>>
>>> On Nov 30, 2017 3:50 AM, "Wail Alkowailee
nforcement to any constraint.
--
*Regards,*
Wail Alkowaileet
BJECT.
> Anybody knows what that is?
>
> Thanks,
> Abdullah.
>
--
*Regards,*
Wail Alkowaileet
+1
On Nov 27, 2017 19:24, "abdullah alamoudi" wrote:
> Dear devs,
> We would like to get your input on changing the syntax of the merge policy
> definition.
>
> Current syntax:
>prefix_merge (("number"="123"),("size"="
> 456"));
>
> Proposed syntax
> {"compaction
I see :)
Thanks!
On Sat, Nov 18, 2017 at 7:23 AM, Mike Carey <dtab...@gmail.com> wrote:
> They were future thoughts.
>
> On Nov 17, 2017 5:15 PM, "Wail Alkowaileet" <wael@gmail.com> wrote:
>
> > Hi all,
> >
> > There are few
Hi all,
There are a few types that have never been used (e.g., ENUM, TYPE, unsigned
integers). Are they still there for legacy reasons/backward compatibility?
--
*Regards,*
Wail Alkowaileet
P.S Spark implementation is under Apache.
On Sat, Oct 28, 2017 at 9:41 AM, Wail Alkowaileet <wael@gmail.com>
wrote:
> Android has an implementation:
> https://github.com/retrostreams/android-retrostreams/blob/master/src/
> main/java/java9/util/TimSort.java
>
> Spark
> On Oct 23, 2017, at 7:14 PM, Wail Alkowaileet <wael@gmail.com>
> wrote:
> >
> > Dear devs,
> >
> > I have a question regarding the opTracker. Currently, we initialize one
> > opTracker per dataset in every NC.
> >
> > My question
aco...@ucr.edu>
> wrote:
> > >
> > > > If only that worked for me :( I have even tried deleting the m2
> > > repository
> > > > cache completely.
> > > > Steven
> > > >
> > > > On Thu, Sep 28, 2017 at 8:19 PM Wail Alkow
QueryTranslator.java:336)
>
> at org.apache.asterix.api.http.server.ApiServlet.post(ApiServlet.java:162)
>
> at org.apache.hyracks.http.server.AbstractServlet.handle(
> AbstractServlet.java:78)
>
> at org.apache.hyracks.http.server.HttpRequestHandler.handle(
> HttpRequestHandler.java:70)
>
> at org.apache.hyracks.http.server.HttpRequestHandler.call(
> HttpRequestHandler.java:55)
>
> at org.apache.hyracks.http.server.HttpRequestHandler.call(
> HttpRequestHandler.java:36)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1142)
>
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:617)
>
> at java.lang.Thread.run(Thread.java:745)
>
--
*Regards,*
Wail Alkowaileet
or.QueryTranslator.
> rewriteCompileQuery(
> >>> QueryTranslator.java:1833)
> >>> at org.apache.asterix.app.translator.QueryTranslator.lambda$han
> >>> dleQuery$1(
> >>> QueryTranslator.java:2306)
> >>> at org.apache.asterix.app.translator.QueryTranslator.createAndRunJob(
> >>> QueryTranslator.java:2406)
> >>> at org.apache.asterix.app.translator.QueryTranslator.
> >>> deliverResult(QueryTranslator.java:2339)
> >>> at org.apache.asterix.app.translator.QueryTranslator.
> >>> handleQuery(QueryTranslator.java:2318)
> >>> at org.apache.asterix.app.translator.QueryTranslator.
> compileAndExecute(
> >>> QueryTranslator.java:370)
> >>> at org.apache.asterix.app.translator.QueryTranslator.
> compileAndExecute(
> >>> QueryTranslator.java:253)
> >>> at org.apache.asterix.api.http.server.ApiServlet.post(ApiServle
> >>> t.java:153)
> >>> at org.apache.hyracks.http.server.AbstractServlet.handle(
> >>> AbstractServlet.java:78)
> >>> at org.apache.hyracks.http.server.HttpRequestHandler.
> >>> handle(HttpRequestHandler.java:70)
> >>> at org.apache.hyracks.http.server.HttpRequestHandler.
> >>> call(HttpRequestHandler.java:55)
> >>> at org.apache.hyracks.http.server.HttpRequestHandler.
> >>> call(HttpRequestHandler.java:36)
> >>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> >>> ThreadPoolExecutor.java:1142)
> >>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> >>> ThreadPoolExecutor.java:617)
> >>> at java.lang.Thread.run(Thread.java:745)
> >>> Caused by: java.lang.IllegalStateException:
> >>> java.lang.ClassNotFoundException:
> >>> org.apache.asterix.runtime.evaluators.functions.records.
> >>> FieldAccessByIndexDescriptor$_Gen
> >>> at org.apache.asterix.runtime.functions.FunctionCollection.
> >>> getGeneratedFunctionDescriptorFactory(FunctionCollection.java:656)
> >>> at org.apache.asterix.runtime.functions.FunctionCollection.<
> >>> clinit>(FunctionCollection.java:631)
> >>> ... 52 more
> >>> Caused by: java.lang.ClassNotFoundException:
> org.apache.asterix.runtime.
> >>> evaluators.functions.records.FieldAccessByIndexDescriptor$_Gen
> >>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> >>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> >>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> >>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> >>> at org.apache.asterix.runtime.functions.FunctionCollection.
> >>> getGeneratedFunctionDescriptorFactory(FunctionCollection.java:652)
> >>> ... 53 more
> >>>
> >>> In my machine the code works fine. In fresh machine it doesn't. When I
> >>> built the master first and the given branch next it works fine. The
> code
> >>> runs all the integration tests in gerrit also successfully. The error
> is
> >>> occurring at "getGeneratedFunctionDescriptorFactory" function at the
> line
> >>> "Class generatedCl = cl.getClassLoader().loadClass(className);"
> where
> >>> it
> >>> calls for loadclass.
> >>>
> >>> I am completely puzzled by this behaviour in a fresh clone of the
> branch.
> >>> Any insight into this would be highly helpful. I am unable to
> find
> >>> the root cause because it occurs only in a fresh clone and when master
> is
> >>> not built before my branch. Kindly help me figure out the issue. Have I
> >>> changed the structure so badly that I am breaking everything?
> >>> Kindly help.
> >>>
> >>> Thank you.
> >>> Sincerely,
> >>> Riyafa
> >>>
> >>> [1] https://github.com/riyafa/asterixdb
> >>> [2] http://localhost:19001/
> >>> [3] https://asterix-gerrit.ics.uci.edu/#/c/1838/
> >>>
> >>
> >>
>
>
--
*Regards,*
Wail Alkowaileet
r JSON which causes a license issue in the code
>>> I
>>> have written[3]. What can I do about this issue?
>>>
>>> [1] https://asterix-gerrit.ics.uci.edu/1838
>>> [2]https://github.com/Esri/geometry-api-java
>>> [3]
>>> https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integ
>>> ration-tests/3290/
>>>
>>> Thank you.
>>> Yours sincerely,
>>> Riyafa
>>>
>>
>
--
*Regards,*
Wail Alkowaileet
nction calls are split. Is there a
reason why that is the case? (There's another example that reproduces the
same behavior)
- That leads to my next question: I see no rule for "FieldAccessNested",
which could be exploited here to save a few function calls. Can this function
interfere with other functions/access methods?
--
*Regards,*
Wail Alkowaileet
nd to get a grasp of the task.
>>>>>>
>>>>>> I am under the impression that the package *org.apache.asterix.om
>>>>>> <http://org.apache.asterix.om> *has the classes for handling data
>>>>>> models
>>>>>> for AsterixDB and have been looking into them to figure out the
>>>>>> implementation details. Please correct me if I am mistaken.
>>>>>>
>>>>>> I have also been reading on the specification for well known text[1]
>>>>>> and
>>>>>> GeoJSON[2] and have been trying to figure out if implementing one of
>>>>>> them
>>>>>> would suffice (if so which one) or if both needs to be implemented. If
>>>>>> both
>>>>>> needs to be implemented we should decide which needs to be implemented
>>>>>> first. I was thinking of going for GeoJSON as it seems to have a wider
>>>>>> usage.
>>>>>>
>>>>>> Any suggestions on how I should proceed with the project would be
>>>>>> highly
>>>>>> valued.
>>>>>>
>>>>>> [1] http://docs.opengeospatial.org/is/12-063r5/12-063r5.html
>>>>>> [2] https://tools.ietf.org/html/rfc7946
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>> Yours sincerely,
>>>>>> Riyafa
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>
--
*Regards,*
Wail Alkowaileet
criptorParsingException
> >
> > I read this link and searched Google, and found that removing the whole
> > repository might work. I tried that and it did not work. I even tried removing
> > "C:\Users\zater\git\asterixdb\hyracks-fullstack\hyracks\
> hyracks-maven-plugins\license-automation-plugin\target\classes",
> > which also did not work. Can you give me some guidance on it? Thanks!
> > Zater
> > 2017/4/24
>
--
*Regards,*
Wail Alkowaileet
instance shutdown will the shutdown
>>> hook wait until the message is delivered and processed?
>>>
>>
>> I agree with Murtadha, that it can certainly be done. However, we also
>> need to assume that some shutdowns won’t be clean and so the messages might
>> not be received. So it might be easier to just be able to recover from
>> missing messages than to be able to recover *and* to synchronize on
>> shutdown. Just a thought - maybe that’s not even an issue for your use-case.
>>
>> Cheers,
>> Till
>>
>
>
--
*Regards,*
Wail Alkowaileet
"http://scai01.cs.ucla.edu:19002/admin/cluster/node/red9/threaddump"
> > }
> > ],
> > "shutdownUri": "http://scai01.cs.ucla.edu:19002/admin/shutdown",
> > "state": "ACTIVE",
> > "versionUri": "http://scai01.cs.ucla.edu:19002/admin/version"
> > }
> >
> > 2.Catalog_return:2.28G
> >
> > catalog_sales:31.01G
> >
> > inventory:8.63G
> >
> > 3.As for Pig and Hive, I always use the default configuration. I didn't
> set
> > the partition things for them. And for Spark, we use 200 partitions,
> which
> > may be improved and just not bad. For AsterixDB, I also set the cluster
> > using default value of partition and JVM things (I didn't manually set
> > these parameters).
> >
> >
> >
> > On Tue, Dec 20, 2016 at 5:58 PM, Yingyi Bu <buyin...@gmail.com> wrote:
> >
> > > Mingda,
> > >
> > > 1. Can you paste the returned JSON of http://<master node>:19002/admin/cluster at your side? (Pls replace <master node> with
> > the
> > > actual master node name or IP)
> > > 2. Can you list the individual size of each dataset involved in
> the
> > > query, e.g., catalog_returns, catalog_sales, and inventory? (I assume
> > > 100GB is the overall size?)
> > > 3. Do Spark/Hive/Pig saturate all CPUs on all machines, i.e., how
> > many
> > > partitions are running on each machine? (It seems that your AsterixDB
> > > configuration wouldn't saturate all CPUs for queries --- in the current
> > > AsterixDB master, the computation parallelism is set to be the same as
> > the
> > > storage parallelism (i.e., the number of iodevices on each NC). I've
> > > submitted a new patch that allow flexible computation parallelism,
> which
> > > should be able to get merged into master very soon.)
> > > Thanks!
> > >
> > > Best,
> > > Yingyi
> > >
> > > On Tue, Dec 20, 2016 at 5:44 PM, mingda li <limingda1...@gmail.com>
> > wrote:
> > >
> > > > Oh, sure. When we test the 100G multiple join, we find AsterixDB is
> > > slower
> > > > than Spark (but still faster than Pig and Hive).
> > > > I can share with you the both plots: 1-10G.eps and 1-100G.eps. (We
> will
> > > > only use 1-10G.eps in our paper).
> > > > And thanks for Ian's advice:* The dev list generally strips
> > attachments.
> > > > Maybe you can just put the config inline? Or link to a
> pastebin/gist?*
> > > > I know why you can't see the attachments. So I move the plots with
> two
> > > > documents to my Dropbox.
> > > > You can find the
> > > > 1-10G.eps here: https://www.dropbox.com/s/
> > rk3xg6gigsfcuyq/1-10G.eps?dl=0
> > > > 1-100G.eps here:https://www.dropbox.com/s/tyxnmt6ehau2ski/1-100G.eps
> ?
> > > dl=0
> > > > cc_conf.pdf here: https://www.dropbox.com/s/
> > y3of1s17qdstv5f/cc_conf.pdf?
> > > > dl=0
> > > > CompleteQuery.pdf here:
> > > > https://www.dropbox.com/s/lml3fzxfjcmf2c1/CompleteQuery.pdf?dl=0
> > > >
> > > > On Tue, Dec 20, 2016 at 4:40 PM, Tyson Condie <
> tcondie.u...@gmail.com>
> > > > wrote:
> > > >
> > > > > Mingda: Please also share the numbers for 100GB, which show
> AsterixDB
> > > not
> > > > > quite doing as well as Spark. These 100GB results will not be in
> our
> > > > > submission version, since they’re not needed for the desired
> message:
> > > > > picking the right join order matters. Nevertheless, I’d like to
> get a
> > > > > better understanding of what’s going on in the larger dataset
> regime.
> > > > >
> > > > >
> > > > >
> > > > > -Tyson
> > > > >
> > > > >
> > > > >
> > > > > From: Yingyi Bu [mailto:buyin...@gmail.com]
> > > > > Sent: Tuesday, December 20, 2016 4:30 PM
> > > > > To: dev@asterixdb.apache.org
> > > > > Cc: Michael Carey <mjca...@ics.uci.edu>; Tyson Condie <
> > > > > tcondie.u...@gmail.com>
> > > > > Subject: Re: Time of Multiple Joins in AsterixDB
> > > > >
> > > > >
> > > > >
> > > > > Hi Mingda,
> > > > >
> > > > >
> > > > >
> > > > > It looks that you didn't attach the pdf?
> > > > >
> > > > > Thanks!
> > > > >
> > > > >
> > > > >
> > > > > Best,
> > > > >
> > > > > Yingyi
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Dec 20, 2016 at 4:15 PM, mingda li <limingda1...@gmail.com
> > > > > <mailto:limingda1...@gmail.com> > wrote:
> > > > >
> > > > > Sorry for the wrong version of cc.conf. I convert it to pdf version
> > as
> > > > > attachment.
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Dec 20, 2016 at 4:06 PM, mingda li <limingda1...@gmail.com
> > > > > <mailto:limingda1...@gmail.com> > wrote:
> > > > >
> > > > > Dear all,
> > > > >
> > > > >
> > > > >
> > > > > I am testing different systems' (AsterixDB, Spark, Hive, Pig)
> > multiple
> > > > > joins to see if there is a big difference with different join
> order.
> > > This
> > > > > is the reason for our research on multiple join and the result will
> > > > apppear
> > > > > in our paper which is to be submitted to VLDB soon. Could you help
> us
> > > to
> > > > > make sure that the test results make sense for AsterixDB?
> > > > >
> > > > >
> > > > >
> > > > > We configure the AsterixDB 0.8.9 ( use
> asterix-server-0.8.9-SNAPSHOT-
> > > > binary-assembly)
> > > > > in our cluster of 16 machines, each with a 3.40GHz i7 processor (4
> > > cores
> > > > > and 2 hyper-threads per core), 32GB of RAM and 1TB of disk
> capacity.
> > > The
> > > > > operating system is 64-bit Ubuntu 12.04. JDK version 1.8.0. During
> > > > > configuration, I follow the NCService instruction here
> > > > > https://ci.apache.org/projects/asterixdb/ncservice.html. And I set
> > the
> > > > > cc.conf as in attachment. (Each node work as nc and the first node
> > also
> > > > > work as cc).
> > > > >
> > > > >
> > > > >
> > > > > For experiment, we use 3 fact tables from TPC-DS: inventory;
> > > > > catalog_sales; catalog_returns with TPC-DS scale factor 1g and 10g.
> > The
> > > > > multiple join query we use in AsterixDB are as following:
> > > > >
> > > > >
> > > > >
> > > > > Good Join Order: SELECT COUNT(*) FROM (SELECT * FROM catalog_sales
> > cs1
> > > > > JOIN catalog_returns cr1
> > > > >
> > > > > ON (cs1.cs_order_number = cr1.cr_order_number AND cs1.cs_item_sk =
> > > > > cr1.cr_item_sk)) m1 JOIN inventory i1 ON i1.inv_item_sk =
> > > > cs1.cs_item_sk;
> > > > >
> > > > >
> > > > >
> > > > > Bad Join Order: SELECT COUNT(*) FROM (SELECT * FROM catalog_sales
> cs1
> > > > JOIN
> > > > > inventory i1 ON cs1.cs_item_sk = i1.inv_item_sk) m1 JOIN
> > > catalog_returns
> > > > > cr1 ON (cs1.cs_order_number = cr1.cr_order_number AND
> cs1.cs_item_sk
> > =
> > > > > cr1.cr_item_sk);
> > > > >
> > > > >
> > > > >
> > > > > We load the data to AsterixDB firstly and run the two different
> > > queries.
> > > > > (The complete version of all queries for AsterixDB is in
> attachment)
> > > We
> > > > > assume the data has already been stored in AsterixDB and only count
> > the
> > > > > time for multiple join.
> > > > >
> > > > >
> > > > >
> > > > > Meanwhile, we use the same dataset and query to test Spark, Pig and
> > > Hive.
> > > > > The result is shown in the attachment's figure. And you can find
> > > > > AsterixDB's time is always better than others no matter good or
> bad
> > > > > order:-) (BTW, the y scale of figure is time in log scale. You can
> > see
> > > > the
> > > > > time by the label of each bar.)
> > > > >
> > > > >
> > > > >
> > > > > Thanks for your help.
> > > > >
> > > > >
> > > > >
> > > > > Bests,
> > > > >
> > > > > Mingda
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>
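A toy Python illustration of why the "good" and "bad" join orders quoted above can differ in cost even though they return the same count: the size of the intermediate result depends on which pair of tables is joined first. The rows below are made up and tiny; only the join columns are taken from the quoted queries, and inv_warehouse_sk is used just to make the inventory rows distinct. Nothing here measures AsterixDB, Spark, Hive, or Pig.

```python
from collections import defaultdict


def hash_join(left, right, left_key, right_key):
    """Equi-join two lists of dict rows using an in-memory hash index on `right`."""
    index = defaultdict(list)
    for row in right:
        index[right_key(row)].append(row)
    return [{**l, **r} for l in left for r in index.get(left_key(l), [])]


# Made-up rows: four sales, one of which was returned, three warehouses per item.
catalog_sales = [
    {"cs_order_number": 1, "cs_item_sk": 10},
    {"cs_order_number": 2, "cs_item_sk": 10},
    {"cs_order_number": 3, "cs_item_sk": 20},
    {"cs_order_number": 4, "cs_item_sk": 20},
]
catalog_returns = [{"cr_order_number": 1, "cr_item_sk": 10}]
inventory = [
    {"inv_item_sk": item, "inv_warehouse_sk": warehouse}
    for item in (10, 20)
    for warehouse in (1, 2, 3)
]


def sales_key(row):
    return (row["cs_order_number"], row["cs_item_sk"])


def returns_key(row):
    return (row["cr_order_number"], row["cr_item_sk"])


def sales_item(row):
    return row["cs_item_sk"]


def inventory_item(row):
    return row["inv_item_sk"]


# Good order: sales JOIN returns first (selective), then inventory.
good_mid = hash_join(catalog_sales, catalog_returns, sales_key, returns_key)
good_final = hash_join(good_mid, inventory, sales_item, inventory_item)

# Bad order: sales JOIN inventory first (fan-out), then returns.
bad_mid = hash_join(catalog_sales, inventory, sales_item, inventory_item)
bad_final = hash_join(bad_mid, catalog_returns, sales_key, returns_key)

print(len(good_mid), len(bad_mid), len(good_final), len(bad_final))  # prints: 1 12 3 3
```

Both orders end with three rows, but the bad order materializes twelve intermediate rows where the good order materializes one; on real fact tables that gap is what the timing plots are trying to expose.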
--
*Regards,*
Wail Alkowaileet
Hi all,
Unfortunately, I won't be able to attend this meeting. Here's my status:
- I was mainly working on the Tuple-Level compaction.
- Helped the MIT folks with exporting and querying the data.
- Finally, there is an initial plan to incorporate Cloudberry in one of our
projects (It's targeted for a
My mistake... That's only for Twitter feed.
On Nov 3, 2016 14:11, "Wail Alkowaileet" <wael@gmail.com> wrote:
> Dears,
>
> Currently, unordered list is the default type of JSON array if it resides
> in the open part.
> That means the user won't be able to ac
t's not a problem).
2- However, returning one field
count( for $x in dataset Tweets
return $x.id
)
=> Worked just fine.
I'm just wondering, does the projection in count() affect its performance?
--
*Regards,*
Wail Alkowaileet
for
its being too wordy.
Here's the link to the design document:
https://docs.google.com/document/d/1Rrhz2Kn9GLJ2OhPbmoHQjth85EA8JExSD7Xnjw7F5aA/edit?usp=sharing
Your feedback is highly appreciated.
--
*Regards,*
Wail Alkowaileet
>> mentioned the usage issue for a different perspective.
>>
>> Thanks
>> Ahmed
>>
>> On Wed, Sep 14, 2016 at 10:30 AM Mike Carey <dtab...@gmail.com> wrote:
>>
>> ☺!
>>>
>>> On Sep 14, 2016 1:11 AM, "Wail Alkowaileet" <wa
wesome if we can bridge non-ADM to ADM types.
--
*Regards,*
Wail Alkowaileet
>
> Best,
>
> Yingyi
>
>
> On Fri, Aug 26, 2016 at 3:44 PM, Wail Alkowaileet <wael@gmail.com>
> wrote:
>
> > Hi AsterixDBers.
> >
> > Is there any easy way to push-down filter to an external source (in my
> case
> > Parquet) without bein
rst round files which can not be GCed.
> > Could you provide the query plan as well?
> >
> > > On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <wael@gmail.com>
> > wrote:
> > >
> > > Hi Ian and Pouria,
> > >
> > > The na
"No space left on device") is just casted from
> the
> > native IOException. Therefore I would be inclined to believe it's
> genuinely
> > out of space. I suppose the question is why the external sort is so huge.
> > What is the query plan? Maybe that will shed light o
arning, do a quick "df -i" on the device -
>> possibly you've run out of inodes even if the space isn't all used up.
>> It's
>> unlikely because I don't think AsterixDB creates a bunch of small files,
>> but worth checking.
>>
>> If that's not it, then can you share
, 2016 at 9:06 AM, Wail Alkowaileet <wael@gmail.com>
wrote:
> Dears,
>
> I have a dataset of size 290GB loaded into 3 NCs, each of which has 2x500GB
> SSD.
>
> Each NC has two IODevices (partitions) in each hard drive (i.e the
> total is 4 iodevices per NC). After loa
hard drive (approximately
250GB of free space on each hard drive). However, when I tried to create
an index of type RTree, I got an exception saying that no space was left on
the hard drive during the External Sort phase.
Is that normal?
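A quick way to check both kinds of headroom raised in this thread, free bytes and free inodes (the "df -i" suggestion), is a short Python sketch like the one below. It is a generic POSIX check, not an AsterixDB tool; the function name and the returned field names are made up for the example, and the path to inspect would be each iodevice directory.

```python
import os
import shutil


def storage_headroom(path):
    """Report free space and free inodes for the filesystem that holds `path`.

    Uses os.statvfs, so it works on POSIX systems only. Some filesystems
    report zero for the inode counters.
    """
    usage = shutil.disk_usage(path)
    stats = os.statvfs(path)
    return {
        "total_bytes": usage.total,
        "free_bytes": usage.free,
        "total_inodes": stats.f_files,
        "free_inodes": stats.f_favail,
    }


if __name__ == "__main__":
    for name, value in storage_headroom(".").items():
        print(f"{name}: {value}")
```

Running it while the index build is in progress, rather than after the failure, shows whether the temporary sort files are what consumes the space.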
--
*Regards,*
Wail Alkowaileet
One more thing:
Can you paste your cluster configuration as well?
Thanks
On Wed, Aug 3, 2016 at 12:32 PM, Wail Alkowaileet <wael@gmail.com>
wrote:
> Hi Kevin,
>
> Thanks for testing it! I really appreciate it.
> Definitely I tested it on my network (KACST) and I just
I have downloaded and built
> it with "sbt package && sbt assembly && sbt publish". Is there any other
> configuration on zeppelin side that I should do?
>
> > On Jul 18, 2016, at 05:26, Wail Alkowaileet <wael@gmail.com> wrote:
> >
> > Sorry.
>> tried my hand at using sed to fix it. The two things I think that
> >>> should
> >>> > >> make this hack work are: replacing the i32/i64 suffixes (so just
> >>> > s/i32//g)
> >>> > >> and removing decimal suffixes (/s/\([0-9]\.[0-9]\)d/\1/g). This
> >>> gives
> >>> > >> output to me, that seems like it is "correct". But the parser is
> >>> still
> >>> > >> complaining and I don't understand why. It fails at line 619,
> column
> >>> > 228.
> >>> > >> The tweet on that line, and the one above it, work fine if I just
> >>> use an
> >>> > >> insert statement.
> >>> > >>
> >>> > >> Does anyone have any thoughts as to maybe what's causing it to not
> >>> take
> >>> > >> this input? I'm hoping it's just something silly I am too tired to
> >>> > see...
> >>> > >> Thanks in advance for any thoughts/suggestions.
> >>> > >>
> >>> > >> -Ian
> >>> > >>
> >>> > >
> >>> > >
> >>> >
> >>>
> >>
> >>
> >
>
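The sed hack quoted above can be sketched in Python as follows, as a hedged illustration rather than a fix for the reported parse error. It strips the i32/i64 suffixes that follow a digit and the d suffix that follows a decimal number, then lets a JSON parser confirm the result. Two assumptions differ from the quoted sed expressions: the integer pattern only fires after a digit, and the decimal pattern accepts more than one fractional digit. Like the sed version, it does not know about string literals, so text inside a tweet that happens to look like "5i32" would be rewritten too.

```python
import json
import re

# Integer suffixes directly after a digit, e.g. 5i64 -> 5
INT_SUFFIX = re.compile(r"(?<=\d)i(?:32|64)\b")
# Double suffix after a decimal number, e.g. 1.25d -> 1.25
DOUBLE_SUFFIX = re.compile(r"(\d\.\d+)d\b")


def strip_type_suffixes(line):
    """Remove the numeric type suffixes so the line can be read as plain JSON."""
    line = INT_SUFFIX.sub("", line)
    return DOUBLE_SUFFIX.sub(r"\1", line)


if __name__ == "__main__":
    sample = '{"id": 5i64, "retweets": 3i32, "lat": 1.25d}'
    cleaned = strip_type_suffixes(sample)
    print(cleaned)
    print(json.loads(cleaned))
```

Parsing each cleaned line with json.loads gives the exact line and column of the first remaining problem, which may help narrow down a failure like the one reported at line 619.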
--
*Regards,*
Wail Alkowaileet