@Pouria here is Uber trip data
https://github.com/fivethirtyeight/uber-tlc-foil-response
On Sep 16, 2016 1:21 AM, "Chen Li" wrote:
> @Wail: as a use case related to selectivity, our current Cloudberry
> prototype doesn't benefit from R-tree when the user is analyzing the data
For full-text search, I like "ftcontains()" since it's very intuitive.
Syntax for advanced full-text features such as stop words, analyzers, and
languages needs a separate discussion.
Chen
On Thu, Sep 15, 2016 at 5:58 PM, Taewoo Kim wrote:
@Till: I see. Thanks for the suggestion. It's clearer now.
Best,
Taewoo
On Thu, Sep 15, 2016 at 5:58 PM, Till Westmann wrote:
And as it turns out, we already have some infrastructure to translate a
constant record constructor expression into a record in
LangRecordParseUtil.
So supporting that wouldn’t be too painful.
Cheers,
Till
On 15 Sep 2016, at 17:41, Till Westmann wrote:
One option to express those parameters would be to pass in a
(compile-time constant) record/object, e.g.:
where ftcontains($o.title, ["hello","hi"],
{ "combine": "and", "stop list": "default" })
That way we could have named optional parameters (please ignore the
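To make the options-record idea concrete, here is a rough Python sketch of how a function could accept named optional parameters as a single record and merge them with defaults. Everything in it is an assumption for illustration: the default values, the tiny stop list, the "none" stop-list name, and the whitespace tokenizer are hypothetical and do not come from the thread or from any actual implementation.

```python
# Hypothetical defaults; the thread does not specify any.
DEFAULT_OPTIONS = {"combine": "or", "stop list": "default"}

# Illustrative stop lists only, not the ones a real system would ship.
STOP_LISTS = {"default": {"a", "an", "the"}, "none": set()}


def ftcontains(text, keywords, options=None):
    """Toy keyword match driven by an options record (a plain dict here)."""
    # Start from the defaults and overlay the caller's record,
    # rejecting option names that are not known.
    opts = dict(DEFAULT_OPTIONS)
    for key, value in (options or {}).items():
        if key not in DEFAULT_OPTIONS:
            raise ValueError(f"unknown option: {key!r}")
        opts[key] = value

    stop = STOP_LISTS[opts["stop list"]]
    # Naive tokenization: lowercase and split on whitespace.
    tokens = {t for t in text.lower().split() if t not in stop}
    wanted = [k.lower() for k in keywords if k.lower() not in stop]
    hits = [k in tokens for k in wanted]

    if opts["combine"] == "and":
        return all(hits)
    if opts["combine"] == "or":
        return any(hits)
    raise ValueError(f"unknown combine mode: {opts['combine']!r}")
```

With this shape, adding a new option later only means adding a key to the defaults; existing calls that omit it keep working.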
Makes sense to me (especially as I always think about this specific one as
"ftcontains" :) ).
Another thing you mentioned is about the parameters that will get added in
the future. Could you provide an example for this?
Cheers,
Till
On 15 Sep 2016, at 15:37, Taewoo Kim wrote:
@Till: Got it. I agree with you. The issue with full-text search is that
many function parameters that control its behavior will be added in the
future. Maybe this is not an issue?
:-)
Best,
Taewoo
On Thu, Sep 15, 2016 at 3:11 PM, Till Westmann
Done.
Best,
Yingyi
On Thu, Sep 15, 2016 at 2:43 PM, Taewoo Kim wrote:
> @Yingyi: Good to know that! I just gave you permission to edit the
> document. Please edit it as needed since I'm not familiar with every
> function, just the ones that you mentioned.
>
> Best,
> Taewoo
>
@Yingyi: I will add the mapping for "string-contains()" in AQL and
"contains()" in SQL++.
Best,
Taewoo
On Thu, Sep 15, 2016 at 2:45 PM, Yingyi Bu wrote:
> All right, if the AQL surface doesn't lead to special tweaks in the
> compiler, e.g., rewriting rules, the same
@Yingyi: Yes. It's just syntactic sugar - it can be anything:
"contains_text", "contains text" or "containstext". It would be nice if
one form of the function were used for both AQL and SQL++. Currently, to
follow the XQuery spec, this doesn't work.
Best,
Taewoo
On Thu, Sep 15, 2016 at 2:27 PM,
Hi Taewoo,
Recently I have added several string functions into *DB:
initcap(title),
regexp_like,
regexp_position,
ltrim,
trim,
rtrim,
position,
repeat,
split
(Replace '_' with '-' in function names for AQL.)
You can add them to the
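As a small illustration of the naming rule above, the following Python sketch derives AQL-style names from the listed function names by replacing '_' with '-'. It assumes the names in the list are the SQL++ spellings; the variable and helper names are hypothetical.

```python
# Function names as listed in the message (assumed to be the SQL++ forms).
SQLPP_NAMES = [
    "initcap", "regexp_like", "regexp_position", "ltrim", "trim",
    "rtrim", "position", "repeat", "split",
]


def aql_name(sqlpp_name):
    """Return the AQL spelling: underscores become hyphens."""
    return sqlpp_name.replace("_", "-")


# SQL++ name -> AQL name, e.g. for filling in a mapping sheet.
mapping = {name: aql_name(name) for name in SQLPP_NAMES}
```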
Hi Taewoo,
Are those full-text search syntax extensions only syntactic sugar
(i.e., a surface thing) that is translated into functions?
In the not-too-distant future, we will need to surface full-text search in
SQL++, probably using the same functions as Oracle. If the AQL fulltext
I just talked to Mike about resolving 'text', and he suggested checking
what other systems do. Fortunately, we collected that information some
time ago. You can check the following sheet to see how other systems do it.
There are many test cases that use *text* as one of their field names. We
can correct them by using 'text' or `text`. But if a user currently uses
*text* as a field name in a dataset, then, clearly, yes, it will not work.
Best,
Taewoo
On Thu, Sep 15, 2016 at 2:02 PM, Chris Hillery
Making "text" a reserved word seems like a more breaking change than the
function names, doesn't it?
Ceej
aka Chris Hillery
On Sep 15, 2016 1:57 PM, "Taewoo Kim" wrote:
Reminder:
Related to the full-text search, a string function named *contains*() will
be renamed to *string-contains*() soon. Also, "*text*" will become a
reserved word just like "for" or "where". It will happen soon as the first
step to the full-text search merge. Here are more details about
@Wail: as a use case related to selectivity, our current Cloudberry
prototype doesn't benefit from R-tree when the user is analyzing the data
for the entire US. But we expect to have R-tree benefits when a user zooms
into a small region.
On Thu, Sep 15, 2016 at 8:25 AM, Wail Alkowaileet
@Wail
One quick question:
By any chance do you have some spatial data at a similar scale
(size/cardinality-wise) but with fewer (ideally no) duplicates? I am
really curious to know if the core of your loading problem is because of
the size/setting that is being used or because of the
Hi Ahmed and Mike,
@Ahmed
I actually did a small experiment where I loaded about 1/5 of the data (so
I could index it), and it seems that the R-Tree was really useful for
querying small regions or neighborhoods.
I also tried the B-Tree and it was slower than a full scan.
@Mike
Unfortunately, I cannot