[ 
https://issues.apache.org/jira/browse/BLUR-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676631#comment-13676631
 ] 

Aaron McCurry commented on BLUR-112:
------------------------------------

Yes, I believe that this is the correct approach.  I have just pushed a big 
commit that cleaned up all the old analyzer code and replaced it with the 
new Double/Int/Float/Long/Text fields.

https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=commit;h=7bbf19d80aa3af80e5869b81827ffc8e8c700d87

So along with that work, the next piece is to make sure that an inbound Column 
type (this is the attribute that will need to be added) does not conflict with 
any defined types.  For example, if in the table descriptor the classname for 
the field "family1.col1" is "int", then it will be parsed into an IntField and 
indexed that way.  However, if the client tries to insert the Family/Column 
"family1.col1" as a "double" type in the Column, then an exception needs to be 
thrown.
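
To make the check concrete, here is a minimal sketch of the conflict test.  The 
class and method names (ColumnTypeGuard, define, check) are hypothetical and are 
not part of Blur; the plain map only stands in for whatever the table 
descriptor ends up exposing.

```java
import java.util.HashMap;
import java.util.Map;

public class ColumnTypeGuard {
    // Hypothetical stand-in for the types declared in the table descriptor,
    // keyed by "family.column" (e.g. "family1.col1" -> "int").
    private final Map<String, String> definedTypes = new HashMap<>();

    public void define(String family, String column, String type) {
        definedTypes.put(family + "." + column, type);
    }

    /**
     * Throws if the inbound Column type conflicts with the defined type.
     * A column with no defined type passes this check.
     */
    public void check(String family, String column, String inboundType) {
        String key = family + "." + column;
        String defined = definedTypes.get(key);
        if (defined != null && !defined.equals(inboundType)) {
            throw new IllegalArgumentException("Column [" + key + "] is defined as ["
                + defined + "] but was sent as [" + inboundType + "]");
        }
    }
}
```

With "family1.col1" defined as "int", check("family1", "col1", "int") passes 
and check("family1", "col1", "double") throws.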

For terminology, let's call Columns whose types are defined in the 
TableDescriptor statically typed Columns, and Columns that are added on the fly 
dynamic Columns.

The dynamic Columns will need an additional guard (and it won't hurt to check 
on all the Columns).  Once a Column type has been defined for a Column on any 
shard in any shard server it cannot be redefined.  So the hardest issue with 
this situation is the race condition across servers.  Example:  Shard Server 1 
is getting mutates and receives a new dynamic Column, let's call it 
"family1.col1234", and its type is "text", and at the same moment Shard Server 2 
is getting mutates and receives the same dynamic Column of "family1.col1234" 
and its type is "double".  One and only one of the types should win and the 
other should throw an exception.
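
The "one and only one wins" rule can be sketched inside a single JVM with an 
atomic put-if-absent.  This is only an illustration: DynamicColumnTypes is a 
hypothetical name, and across shard servers the in-memory map would have to be 
replaced by a shared store that offers the same atomic operation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class DynamicColumnTypes {
    // Maps "family.column" to the first type registered for it.
    private final ConcurrentMap<String, String> types = new ConcurrentHashMap<>();

    /**
     * Registers the type for a dynamic Column.  The first caller wins;
     * a later caller with a different type gets an exception.
     */
    public void register(String familyAndColumn, String type) {
        // putIfAbsent is atomic: it returns null only for the winning caller.
        String existing = types.putIfAbsent(familyAndColumn, type);
        if (existing != null && !existing.equals(type)) {
            throw new IllegalStateException("Column [" + familyAndColumn
                + "] is already typed as [" + existing
                + "] and cannot be redefined as [" + type + "]");
        }
    }

    public String typeOf(String familyAndColumn) {
        return types.get(familyAndColumn);
    }
}
```

If one thread registers "family1.col1234" as "text" and another registers it as 
"double", whichever call reaches putIfAbsent first sets the type and the other 
throws.  Registering the same type twice is harmless.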

There is code in 0.2.0 that provides a solution for this race condition by 
using ZooKeeper.

https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=blob;f=src/blur-util/src/main/java/org/apache/blur/zookeeper/ZkCachedMap.java;h=22eb9e64480e66b41da358d59060dc4331e1390c;hb=aef8938eb5987b5f19a3bd3260d5ebafcf6cf751

So we should pull that class in from 0.2-dev.

I hope this response gives you an idea of what needs to be done conceptually, 
and if you want to work on it I can help direct you to the various portions of 
code that will need to be modified.
                
> Allow for types to be set on blur tables
> ----------------------------------------
>
>                 Key: BLUR-112
>                 URL: https://issues.apache.org/jira/browse/BLUR-112
>             Project: Apache Blur
>          Issue Type: Improvement
>    Affects Versions: 0.1.5
>            Reporter: Aaron McCurry
>             Fix For: 0.1.5
>
>
> Create the ability for Blur to handle the default Lucene field types.  This 
> should not be tied to the table descriptor because types should be allowed to 
> be added at runtime.  Also 2 new fields should be added to the 
> TableDescriptor:
> 1. A strict types attribute.  If set to true and a new column is added to the 
> table with no type mapping for it, throw an exception.  Set to false 
> by default.
> 2. Default type if strict is set to false.  The default type should be text.
> Also, dynamic columns could be allowed if their name included the type.  Such 
> as:
> The column name could be "col1" with a type of "int", in the Column struct in 
> thrift the name would be "col1/int" and if the type did not exist before the 
> call it would be added.
> Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
