Aaron McCurry updated BLUR-322:
-------------------------------
Description:
After spending the last few weeks using the most recent version of Blur off the
apache-blur-0.2 branch, I have found a lot of issues surrounding the WAL
(write-ahead log), the NRT (near-real-time) Lucene implementation, and the
IndexImporter. The basic problems seem to be locking issues between these
components.
There is now a simpler version of the BlurIndex (BlurIndexSimpleWriter) that
outperforms the current default implementation in a lot of ways. There are no
blocking issues and there are no index importing issues from MapReduce.
However, the BlurIndexSimpleWriter is missing a WAL and it does not perform
well for NRT updates.
So this issue is to create new data ingestion methods in the Thrift API that
allow for more bulk-like updates.
The basic methods should be something like:
string startTransaction()
void replaceRow(Row)
void replaceRows(list<Row>)
void deleteRow(string id)
void rollbackTransaction(string)
void commitTransaction(string)
This type of update mechanism should have performance in line with MapReduce
indexing.
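To illustrate the intended call sequence, here is a minimal in-memory sketch of the transaction semantics the methods above suggest: updates are buffered per transaction and only become visible on commit. It is not Blur code. The class name, the dict-based rows, and passing the transaction id to every update call are assumptions made for the example (the proposed signatures above do not yet say how an update is tied to its transaction).

```python
import uuid


class TransactionalIndex:
    """Hypothetical in-memory stand-in for a table with transactional bulk updates."""

    def __init__(self):
        self.rows = {}     # committed rows, keyed by row id
        self.pending = {}  # transaction id -> list of buffered operations

    def start_transaction(self):
        # Hand back an opaque id, mirroring `string startTransaction()`.
        txn_id = uuid.uuid4().hex
        self.pending[txn_id] = []
        return txn_id

    def replace_row(self, txn_id, row_id, row):
        # Assumption: the transaction id is passed explicitly with each update.
        self.pending[txn_id].append(("replace", row_id, row))

    def replace_rows(self, txn_id, rows):
        # `rows` maps row id -> row contents.
        for row_id, row in rows.items():
            self.replace_row(txn_id, row_id, row)

    def delete_row(self, txn_id, row_id):
        self.pending[txn_id].append(("delete", row_id, None))

    def rollback_transaction(self, txn_id):
        # Discard buffered operations; committed rows are untouched.
        del self.pending[txn_id]

    def commit_transaction(self, txn_id):
        # Apply buffered operations in order, then forget the transaction.
        for op, row_id, row in self.pending.pop(txn_id):
            if op == "replace":
                self.rows[row_id] = row
            else:
                self.rows.pop(row_id, None)


index = TransactionalIndex()
txn = index.start_transaction()
index.replace_rows(txn, {"row-1": {"name": "a"}, "row-2": {"name": "b"}})
index.delete_row(txn, "row-2")
index.commit_transaction(txn)  # only now do the changes reach index.rows
```

Buffering everything until commit is what would let a real implementation write the batch in bulk rather than row by row, which is the property the description asks for.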
> Add a transactional data update call in thrift
> ----------------------------------------------
>
> Key: BLUR-322
> URL: https://issues.apache.org/jira/browse/BLUR-322
> Project: Apache Blur
> Issue Type: Improvement
> Components: Blur
> Affects Versions: 0.3.0, 0.2.2
> Reporter: Aaron McCurry
> Fix For: 0.3.0, 0.2.2