[ 
https://issues.apache.org/jira/browse/SOLR-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13621:
----------------------------------
    Description: 
Streaming Expressions and Math Expressions are now mature on the query side. 
This includes the ability to query, transform, analyze and visualize data. 

It's now time to build the data loading and log parsing capabilities to apply 
the full suite of mathematics and visualizations over log data and CSV files.

The design is to have stream sources that read from a file system and stream 
decorators that parse different file and log formats. The data can be further 
transformed and joined with other data by stream decorators and sent to any 
Solr Cloud collection with the update Stream.
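
The source/decorator layering described above can be sketched outside Solr with 
plain Python generators. This is only an illustration of the design, not Solr's 
API: the names file_source, parse_csv, regex_filter and update are hypothetical, 
and an in-memory list stands in for a Solr Cloud collection.

```python
import csv
import io
import re
from typing import Dict, Iterable, Iterator, List

# All names below are hypothetical stand-ins, not Solr functions.

def file_source(lines: Iterable[str]) -> Iterator[str]:
    """Stream source: emits raw lines (an open file works as `lines`)."""
    for line in lines:
        yield line.rstrip("\n")

def parse_csv(source: Iterator[str]) -> Iterator[Dict[str, str]]:
    """Stream decorator: reads the first line as a header, emits one dict per row."""
    for row in csv.DictReader(source):
        yield dict(row)

def regex_filter(source: Iterator[Dict[str, str]], field: str,
                 pattern: str) -> Iterator[Dict[str, str]]:
    """Stream decorator: keeps only records whose field matches the pattern."""
    compiled = re.compile(pattern)
    for record in source:
        if compiled.search(record.get(field, "")):
            yield record

def update(collection: List[Dict[str, str]],
           source: Iterator[Dict[str, str]]) -> int:
    """Stand-in for an update stream: appends records to a list, returns the count."""
    count = 0
    for record in source:
        collection.append(record)
        count += 1
    return count

# Usage: wrap the source in decorators, then drain the stream into the "collection".
raw = io.StringIO("level,msg\nERROR,disk full\nINFO,started\nERROR,timeout\n")
index: List[Dict[str, str]] = []
loaded = update(index, regex_filter(parse_csv(file_source(raw)), "level", "^ERROR$"))
```

Each decorator only consumes the stream it wraps, so parsers for other file or 
log formats could be swapped in without touching the source or the update step.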

This design also allows Streaming Expressions to perform regex filtering, 
aggregations, statistical analysis and visualization directly over CSV files 
and log files before the data is loaded to Solr. Because of Streaming 
Expressions' built-in parallelization capabilities, this allows Solr Cloud to 
behave like a massively parallel *grep* engine. 
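
As a rough illustration of the parallel *grep* idea, the sketch below filters 
several partitions of log lines concurrently and merges the matches. It assumes 
in-memory lists in place of files and a local thread pool in place of Solr Cloud 
workers; grep_shard and parallel_grep are hypothetical names, not part of Solr.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import List

def grep_shard(lines: List[str], pattern: str) -> List[str]:
    """Return the lines of one partition that match the pattern."""
    compiled = re.compile(pattern)
    return [line for line in lines if compiled.search(line)]

def parallel_grep(shards: List[List[str]], pattern: str,
                  workers: int = 4) -> List[str]:
    """Filter every partition concurrently, then merge results in shard order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input shards.
        results = pool.map(lambda shard: grep_shard(shard, pattern), shards)
        return [line for matched in results for line in matched]

# Usage: two partitions of access-log lines, keeping only 5xx responses.
shards = [
    ["GET /a 200", "GET /b 500"],
    ["POST /c 503", "GET /d 200"],
]
errors = parallel_grep(shards, r" 5\d\d$")
```

The filtering work is independent per partition, which is what lets it spread 
across workers; only the final merge needs to see all of the results.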

It also allows users to visualize data using Apache Zeppelin as part of the 
loading process, making it easier to understand the data before it is loaded 
into an index.

This ticket will track the sub-tickets for the different log formats that will 
be supported.

  was:
Streaming Expressions and Math Expressions are now mature on the query side. 
This includes the ability to query, transform, analyze and visualize data. 

It's now time to build the data loading and log parsing capabilities to apply 
the full suite of mathematics and visualizations over log data and CSV files.

The design is to have stream sources that read from a file system and stream 
decorators that parse different file and log formats. The data can be further 
transformed and joined with other data by stream decorators and sent to any 
Solr Cloud collection with the update Stream.

This design also allows Streaming Expressions to perform regex filtering, 
aggregations, statistical analysis and visualizations directly over CSV files 
and log files before the data is loaded to Solr. Because of Streaming 
Expressions' built-in parallelization capabilities, this allows Solr Cloud to 
behave like a massively parallel *grep* engine. 

It also allows users to visualize data using Apache Zeppelin as part of the 
loading process, making it easier to understand the data before it is loaded 
into an index.

This ticket will track the sub-tickets for the different log formats that will 
be supported.


> Visual log parsing and data loading framework for Streaming Expressions
> -----------------------------------------------------------------------
>
>                 Key: SOLR-13621
>                 URL: https://issues.apache.org/jira/browse/SOLR-13621
>             Project: Solr
>          Issue Type: New Feature
>          Components: streaming expressions
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Major
>
> Streaming Expressions and Math Expressions are now mature on the query side. 
> This includes the ability to query, transform, analyze and visualize data. 
> It's now time to build the data loading and log parsing capabilities to apply 
> the full suite of mathematics and visualizations over log data and CSV files.
> The design is to have stream sources that read from a file system and stream 
> decorators that parse different file and log formats. The data can be further 
> transformed and joined with other data by stream decorators and sent to any 
> Solr Cloud collection with the update Stream.
> This design also allows Streaming Expressions to perform regex filtering, 
> aggregations, statistical analysis and visualization directly over CSV files 
> and log files before the data is loaded to Solr. Because of Streaming 
> Expressions' built-in parallelization capabilities, this allows Solr Cloud to 
> behave like a massively parallel *grep* engine. 
> It also allows users to visualize data using Apache Zeppelin as part of the 
> loading process, making it easier to understand the data before it is loaded 
> into an index.
> This ticket will track the sub-tickets for the different log formats that 
> will be supported.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
