Re: Creating RTree: no space left

2016-09-15 Thread Khurram Faraaz
@Pouria here is Uber trip data https://github.com/fivethirtyeight/uber-tlc-foil-response On Sep 16, 2016 1:21 AM, "Chen Li" wrote: > @Wail: as a use case related to selectivity, our current Cloudberry > prototype doesn't benefit from R-tree when the user is analyzing the data

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Chen Li
For full-text search, I like "ftcontains()" since it's very intuitive. Syntax for advanced full-text features such as stop words, analyzers, and languages needs a separate discussion. Chen On Thu, Sep 15, 2016 at 5:58 PM, Taewoo Kim wrote: > @Till: I see. Thanks for the

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Taewoo Kim
@Till: I see. Thanks for the suggestion. It's clearer now. Best, Taewoo On Thu, Sep 15, 2016 at 5:58 PM, Till Westmann wrote: > And as it turns out, we already have some infrastructure to translate a > constant record constructor expression into a record in >

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Till Westmann
And as it turns out, we already have some infrastructure to translate a constant record constructor expression into a record in LangRecordParseUtil. So supporting that wouldn’t be too painful. Cheers, Till On 15 Sep 2016, at 17:41, Till Westmann wrote: One option to express those parameters,

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Till Westmann
One option to express those parameters would be to pass in a (compile-time constant) record/object. E.g. where ftcontains($o.title, ["hello","hi"], { "combine": "and", "stop list": "default" }) That way we could have named optional parameters (please ignore the
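A rough illustration of the "options record" idea from this message, written in Python rather than AQL. Everything here is an assumption made for illustration: the default values, the helper name resolve_ft_options, and the toy whole-word matching are hypothetical and are not the actual implementation. The only point is to show how a single record argument can carry named optional parameters that are merged over defaults.

```python
# Hypothetical sketch: option names come from the example in the message,
# but the defaults and the matching logic are illustrative assumptions.
DEFAULT_OPTIONS = {"combine": "or", "stop list": "default"}


def resolve_ft_options(options=None):
    """Merge a caller-supplied options record over the assumed defaults."""
    options = options or {}
    unknown = set(options) - set(DEFAULT_OPTIONS)
    if unknown:
        raise ValueError(f"unknown full-text option(s): {sorted(unknown)}")
    merged = {**DEFAULT_OPTIONS, **options}
    if merged["combine"] not in ("and", "or"):
        raise ValueError('"combine" must be "and" or "or"')
    return merged


def ftcontains(text, keywords, options=None):
    """Toy matcher: case-insensitive whole-word lookup.

    The "stop list" option is accepted but ignored in this sketch.
    """
    opts = resolve_ft_options(options)
    words = set(text.lower().split())
    hits = [kw.lower() in words for kw in keywords]
    return all(hits) if opts["combine"] == "and" else any(hits)
```

With this shape, a new parameter only needs a new key in the defaults record, and existing calls that omit the record keep working.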

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Till Westmann
Makes sense to me (especially as I always think about this specific one as "ftcontains" :) ). Another thing you mentioned is about the parameters that will get added in the future. Could you provide an example for this? Cheers, Till On 15 Sep 2016, at 15:37, Taewoo Kim wrote: Maybe we

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Taewoo Kim
@Till: Got it. I agree with you. The issue here for the full-text search is that many function parameters that control the behavior of full-text search will be added in the future. Maybe this is not an issue? :-) Best, Taewoo On Thu, Sep 15, 2016 at 3:11 PM, Till Westmann

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Yingyi Bu
Done. Best, Yingyi On Thu, Sep 15, 2016 at 2:43 PM, Taewoo Kim wrote: > @Yingyi: Good to know that! I just gave you permission to edit the > document. Please edit it as needed since I'm not familiar with every > function, just some that you mentioned. > > Best, > Taewoo >

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Taewoo Kim
@Yingyi: will add the mapping for "string-contains()" in AQL and "contains()" in SQL++. Best, Taewoo On Thu, Sep 15, 2016 at 2:45 PM, Yingyi Bu wrote: > All right, if the AQL surface doesn't lead to special tweaks in the > compiler, e.g., rewriting rules, the same

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Taewoo Kim
@Yingyi: Yes. It's just syntactic sugar - it can be anything: "contains_text", "contains text" or "containstext". It would be nice if one form of the function were used for both AQL and SQL++. Currently, to follow the XQuery spec, this doesn't work. Best, Taewoo On Thu, Sep 15, 2016 at 2:27 PM,

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Yingyi Bu
Hi Taewoo, Recently I have added several string functions into *DB: initcap(title), regexp_like, regexp_position, ltrim, trim, rtrim, position, repeat, split (Replace '_' with '-' in function names for AQL.) You can add them to the

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Yingyi Bu
Hi Taewoo, Are those fulltext search syntax extensions only syntactic sugar (i.e., a surface thing) that is translated into functions? In the not-too-distant future, we will need to surface fulltext search in SQL++, probably using the same functions as Oracle. If the AQL fulltext

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Taewoo Kim
I just talked to Mike about resolving 'text', and he suggested checking what other systems do. Fortunately, we collected that information some time ago. You can check the following sheet to see how other systems do it.

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Taewoo Kim
There are many test cases that use *text* as one of their field names. We can correct them using 'text' or `text`. But if a user currently uses *text* as a field name in a dataset, then, clearly, yes, it will not work. Best, Taewoo On Thu, Sep 15, 2016 at 2:02 PM, Chris Hillery

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Chris Hillery
Making "text" a reserved word seems like a more breaking change than the function names, doesn't it? Ceej aka Chris Hillery On Sep 15, 2016 1:57 PM, "Taewoo Kim" wrote: > Reminder: > > Related to the full-text search, a string function named *contains*() will > be renamed

Re: Function name change: contains() -> string-contains()

2016-09-15 Thread Taewoo Kim
Reminder: Related to the full-text search, a string function named *contains*() will be renamed to *string-contains*() soon. Also, "*text*" will become a reserved word just like "for" or "where". It will happen soon as the first step to the full-text search merge. Here are more details about

Re: Creating RTree: no space left

2016-09-15 Thread Chen Li
@Wail: as a use case related to selectivity, our current Cloudberry prototype doesn't benefit from R-tree when the user is analyzing the data for the entire US. But we expect to have R-tree benefits when a user zooms into a small region. On Thu, Sep 15, 2016 at 8:25 AM, Wail Alkowaileet

Re: Creating RTree: no space left

2016-09-15 Thread Pouria Pirzadeh
@Wail One quick question: By any chance do you have some spatial data at a similar scale (size/cardinality-wise) but with fewer (ideally no) duplicates? I am really curious to know if the core of your loading problem is because of the size/setting that is being used or because of the

Re: Creating RTree: no space left

2016-09-15 Thread Wail Alkowaileet
Hi Ahmed and Mike, @Ahmed I actually did a small experiment where I loaded about 1/5 of the data (so I can index it), and it seems that the R-Tree was really useful for querying small regions or neighborhoods. I also tried the B-Tree and it was slower than a full scan. @Mike Unfortunately, I cannot