[jira] [Commented] (DERBY-590) How to integrate Derby with Lucene API?

Rick Hillegas (JIRA) Tue, 22 Oct 2013 08:42:39 -0700

    [ 
https://issues.apache.org/jira/browse/DERBY-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801935#comment-13801935
 ]


Rick Hillegas commented on DERBY-590:
-------------------------------------

Thanks for that additional information, Andrew. Concerning i, here are some 
thoughts:

1) I see that this new tool is being parked next to the existing 
DBMDWrapper and ForeignDBViews classes. I do agree that all of these tools 
should be neighbors. Just a heads-up, however: I think I put DBMDWrapper and 
ForeignDBViews in the wrong jar file to begin with. I now think those tools 
really ought to go into the engine jar, for reasons which I gave in a 
2013-06-13 comment on https://issues.apache.org/jira/browse/DERBY-6256. To 
summarize: the point of derbytools.jar is to hold code which runs both 
client-side and server-side. None of these tools run client-side. My point is 
this: I'm not faulting the patch for following the existing pattern; instead, 
I'm saying that I put the original tools in the wrong place. And at some point 
I may file a follow-on JIRA to move all of these tools into the engine jar.

2) If I understand the patch correctly, the indexTable() procedure really 
indexes a column. You could run the procedure multiple times on the same table 
in order to index different columns. I think that createDocumentIndex() would 
be a better name for this procedure.

3) Similarly, I'm not keen on the name luceneUpdateDocument() for several 
reasons. I think that the "lucene" prefix can be assumed from the name of the 
schema which holds this procedure. I'd also like the procedure names to express 
the fact that luceneUpdateDocument() refreshes the index created by 
indexTable(). So I'd recommend something like updateDocumentIndex(), akin to 
createDocumentIndex().

Some more thoughts follow:

4) In my experience, every insert function needs to be matched by corresponding 
update and delete functions. Developers expect that. This tool provides an 
insert function (createDocumentIndex()) and a corresponding update function 
(updateDocumentIndex()) but no corresponding delete function. The delete 
function is really important as developers hack out their schemas in the 
laboratory. So I recommend adding a dropDocumentIndex() procedure. I understand 
your reservations about deleting a whole directory, but I think that the first 
enhancement request we'll get is "give me a way to delete these things."
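
To make the directory-deletion concern concrete, here is a minimal Java sketch 
of what the file-system half of a dropDocumentIndex() procedure might do. It 
is not taken from the patch. It assumes that each document index lives in its 
own directory under a common root, and the names IndexDirectoryDropper and 
dropIndexDirectory are hypothetical. The guard refuses to delete anything 
that is not strictly inside that root.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Derby or of the attached patch.
final class IndexDirectoryDropper {

    // Deletes indexDir and everything beneath it, but only if indexDir
    // resolves to a location strictly inside indexRoot.
    static void dropIndexDirectory(Path indexRoot, Path indexDir) throws IOException {
        Path root = indexRoot.toRealPath();
        Path target = indexDir.toRealPath();
        if (target.equals(root) || !target.startsWith(root)) {
            throw new IllegalArgumentException(
                "Refusing to delete outside the index root: " + target);
        }

        List<Path> paths;
        try (Stream<Path> walk = Files.walk(target)) {
            // Reverse order puts every file and subdirectory before its parent.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path path : paths) {
            Files.delete(path);
        }
    }
}
```

A caller would pass the root under which all indexes are stored and the 
directory of the single index being dropped. Passing the root itself, or a 
path outside it, raises IllegalArgumentException instead of deleting 
anything.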

5) The non-transactional behavior of these procedures needs to be clearly 
understood by users. That's probably a documentation issue. But users need to 
understand that they can't roll back some of the important effects of calls to 
createDocumentIndex() and dropDocumentIndex().
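
The non-transactional behavior can be illustrated with a toy model. The Java 
sketch below is only a simulation and uses no Derby or Lucene API: 
ToyTransaction is a made-up class whose row changes are buffered until 
commit, while its directory creation hits the disk immediately. After a 
rollback, the buffered row is gone but the directory is still there, which is 
the kind of surprise users would need to be warned about.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is not how Derby implements transactions.
final class ToyTransaction {
    private final List<String> committedRows = new ArrayList<>();
    private final List<String> pendingRows = new ArrayList<>();

    // Transactional effect: buffered until commit(), discarded by rollback().
    void insertRow(String row) {
        pendingRows.add(row);
    }

    // Non-transactional effect: the directory is created on disk right away
    // and nothing in this class remembers how to undo it.
    Path createIndexDirectory(Path parent, String name) throws IOException {
        return Files.createDirectory(parent.resolve(name));
    }

    void commit() {
        committedRows.addAll(pendingRows);
        pendingRows.clear();
    }

    void rollback() {
        pendingRows.clear();
    }

    List<String> visibleRows() {
        return new ArrayList<>(committedRows);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("toy-index");
        ToyTransaction tx = new ToyTransaction();
        tx.insertRow("metadata row for the new index");
        Path indexDir = tx.createIndexDirectory(root, "T1_C1");
        tx.rollback();

        System.out.println("rows visible after rollback: " + tx.visibleRows().size());
        System.out.println("index directory still exists: " + Files.exists(indexDir));

        // Clean up the temporary files created by this demo.
        Files.delete(indexDir);
        Files.delete(root);
    }
}
```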

6) Developers hacking out a schema in the laboratory will also want tools for 
introspecting which columns have been indexed and how current the indexes are. 
Maybe the best solution would be a view wrapping a table function which, in 
turn, exposes metadata from Lucene and/or the file system. If that's not 
possible, a table like the following could be useful:

create table LuceneSupport.documentIndexes
(
    id int generated always as identity,
    tableID char( 36 ) not null,
    columnNumber int,
    lastupdated timestamp,
    unique( tableID, columnNumber )
);

I prefer the table function solution because it makes it harder for the user to 
mess up and accidentally delete this metadata.
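
If the table function route is taken, its rows could be derived from the file 
system instead of being stored where a user can delete them. Here is a 
minimal Java sketch of such a scan. It rests on assumptions that are not in 
the patch: one directory per indexed column under a common root, and the 
directory's last-modified time as a rough stand-in for lastupdated. The names 
IndexMetadataScanner and IndexInfo are hypothetical, and wrapping the result 
in the java.sql.ResultSet that a Derby table function must return is left 
out.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Derby or of the attached patch.
final class IndexMetadataScanner {

    // One row of metadata per index directory found on disk.
    static final class IndexInfo {
        final String directoryName;
        final Instant lastUpdated;

        IndexInfo(String directoryName, Instant lastUpdated) {
            this.directoryName = directoryName;
            this.lastUpdated = lastUpdated;
        }
    }

    // Lists the immediate subdirectories of indexRoot, sorted by name.
    // Regular files directly under the root are ignored.
    static List<IndexInfo> scan(Path indexRoot) throws IOException {
        List<IndexInfo> rows = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(indexRoot)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    rows.add(new IndexInfo(
                        entry.getFileName().toString(),
                        Files.getLastModifiedTime(entry).toInstant()));
                }
            }
        }
        rows.sort((a, b) -> a.directoryName.compareTo(b.directoryName));
        return rows;
    }
}
```

Because the rows are recomputed from disk on every call, there is no 
metadata table for a user to damage.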

7) At this point, I don't see a need for the syntactic sugar of new SQL 
statements. I think that the optional tool approach is fine for this first 
increment. I recommend deferring any parser work until later, after we've 
cleared up the transactional consistency issues.

Thanks!
-Rick


> How to integrate Derby with Lucene API?
> ---------------------------------------
>
>                 Key: DERBY-590
>                 URL: https://issues.apache.org/jira/browse/DERBY-590
>             Project: Derby
>          Issue Type: Improvement
>          Components: Documentation, SQL
>            Reporter: Abhijeet Mahesh
>              Labels: derby_triage10_11
>         Attachments: lucene_demo.diff
>
>
> What steps are needed in order to use Derby with the Lucene API?



--
This message was sent by Atlassian JIRA
(v6.1#6144)
