[ https://issues.apache.org/jira/browse/MADLIB-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Frank McQuillan updated MADLIB-995:
-----------------------------------
Description:
Story
As a data scientist, I want to be able to define multiple symbols that result
in overlapping partitions.
See
http://madlib.incubator.apache.org/docs/latest/group__grp__path.html
for a description of what a symbol is.
Currently, in 1.9, overlapping partitions are not supported. The default is
non-overlapping: the path algorithm begins the next pattern search at the row
that follows the last pattern match (similar to how grep works in UNIX).
In the overlapping case, the path algorithm needs to find every occurrence of
the pattern in the partition, regardless of whether it was part of a
previously found match. This means one row can match multiple symbols in a
given matched pattern, so there is a dependency on
https://issues.apache.org/jira/browse/MADLIB-943
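To illustrate the difference between the two modes, here is a minimal sketch in
plain Python. It is not MADlib code: the function name find_matches and the
overlapping flag are hypothetical, and it assumes each row in a partition has
already been reduced to a single-character symbol, so that a partition can be
treated as a string and the pattern as a regular expression. In overlapping
mode it reports one match per starting row, which may be narrower than what the
final feature needs.

```python
import re


def find_matches(symbols: str, pattern: str, overlapping: bool = False):
    """Return (start, end) row spans where `pattern` matches in `symbols`.

    Hypothetical helper for illustration only; not part of MADlib.
    """
    if not overlapping:
        # re.finditer resumes scanning at the row after the previous match,
        # which is the non-overlapping behaviour described above.
        return [m.span() for m in re.finditer(pattern, symbols)]

    # A zero-width lookahead consumes no rows, so a match is attempted
    # starting at every row, even rows inside an earlier match.
    lookahead = re.compile(f"(?=({pattern}))")
    return [m.span(1) for m in lookahead.finditer(symbols)]


partition = "ABABA"  # one symbol per row, in partition order
print(find_matches(partition, "ABA"))                    # [(0, 3)]
print(find_matches(partition, "ABA", overlapping=True))  # [(0, 3), (2, 5)]
```

In the overlapping result, the row at index 2 belongs to both matches, which is
the situation the story above describes.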
Acceptance
was:
Story
As a data scientist, I want to be able to define multiple symbols that result
in overlapping partitions.
See
http://madlib.incubator.apache.org/docs/latest/group__grp__path.html
for a description of what a symbol is.
Currently in 1.9, overlapping partitions are not supported.
Acceptance
> Path - overlapping partitions
> -----------------------------
>
> Key: MADLIB-995
> URL: https://issues.apache.org/jira/browse/MADLIB-995
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Utilities
> Reporter: Frank McQuillan
> Fix For: v1.9.1
>
>
> Story
> As a data scientist, I want to be able to define multiple symbols that result
> in overlapping partitions.
> See
> http://madlib.incubator.apache.org/docs/latest/group__grp__path.html
> for a description of what a symbol is.
> Currently, in 1.9, overlapping partitions are not supported. The default is
> non-overlapping: the path algorithm begins the next pattern search at the
> row that follows the last pattern match (similar to how grep works in UNIX).
> In the overlapping case, the path algorithm needs to find every occurrence of
> the pattern in the partition, regardless of whether it was part of a
> previously found match. This means one row can match multiple symbols in
> a given matched pattern, so there is a dependency on
> https://issues.apache.org/jira/browse/MADLIB-943
> Acceptance