Re: Importing CSV with empty values

2022-05-11 Thread Michael Carey
@Nicholas:  Did this solve it, hopefully? On 5/9/22 7:10 PM, Ali Alsuliman wrote: Nicholas, The parser does not know how to handle (or rather what value to produce for) empty values when the field is numeric as in your case (int32). That's why it's complaining. The parser cannot produce 0 for

Re: Increasing degree of parallelism when reading Parquet files

2021-08-10 Thread Michael Carey
to the number of I/O devices. However, with what Dmitry said, I guess that this is expected behavior and the flag should only influence the degree of parallelism after exchanges (which I don't have in my queries). Cheers, Ingo -----Original Message----- From: Michael Carey Sent: Monday, Aug

Re: Increasing degree of parallelism when reading Parquet files

2021-08-09 Thread Michael Carey
Ingo, Q: In your Parquet/S3 testing, what does your current cluster configuration look like?  (I.e., how many partitions have you configured it with - physical storage partitions that is?)  Even though your S3 data isn't stored inside AsterixDB in this case, the system still uses that info

Re: what parallel DBMS is AsterixDB compared against?

2019-08-07 Thread Michael Carey
Anyone know if Greenplum has the "DeWitt clause"? On Wed, Aug 7, 2019 at 4:20 PM Michael Carey <mjca...@ics.uci.edu> wrote: (Meant to reply to the list!) Forwarded Message Subject: Re: what parallel DBMS is AsterixDB compared against? Date:

New SQL++ book!

2018-11-04 Thread Michael Carey
FYI:  There is now a free PDF download of Don Chamberlin's terrific new book "SQL++ for SQL Users" available to Apache AsterixDB users on the Apache AsterixDB web site. (It's available in the Documentation pulldown.)  It's also available to non-Apache AsterixDB users on Amazon.  :-)

Re: Hyracks Job Requirement Configuration

2018-01-29 Thread Michael Carey
Rana's work shows a clear user requirement (@Xikui pay attention :-)) -- we need two forms of parallelism hint, one that does what we currently do - which is widen the parallelism AFTER reading from storage at the first opportunity to do so - and another that widens it IMMEDIATELY (somehow

Re: AsterixDB Performance Tuning

2018-01-26 Thread Michael Carey
Also:  What are the data sizes in the two systems? On 1/26/18 10:00 AM, Taewoo Kim wrote: Hi Rana, Thank you for attaching your plan. It seems that the selections are correctly made before each join. If your query predicate is selective enough (e.g., I.LABEL = 'Haptoglobin' generates less

Re: Parse GeoJSON data into a record in AsterixDB

2017-06-22 Thread Michael Carey
Cool! On 6/22/17 7:58 AM, Riyafa Abdul Hameed wrote: Dear all, Thank you very much. I hadn't thought of an AnyObject type. Now I am able to parse GeoJSON using the following: DROP DATAVERSE GeoData IF EXISTS; CREATE DATAVERSE GeoData; USE GeoData; CREATE TYPE AnyObject AS {}; CREATE TYPE