[
https://issues.apache.org/jira/browse/DERBY-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813121#comment-13813121
]
Rick Hillegas commented on DERBY-590:
-------------------------------------
Thanks for the second rev of the patch, Andrew. I have a couple comments under
the following topics:
o Build
o Localization
o Security
o Misc
I applied the patch, copied the Lucene jar files into tools/java, and
successfully ran LuceneSupportTest. In addition, I ran some more experiments
with a script attached at the end of this comment.
Thanks,
-Rick
-------------- BUILD ----------------
This optional tool relies on the following Lucene jar files. The user
documentation should mention that these are necessary in order to use this tool:
A tools/java/lucene-analyzers-common-4.5.0.jar
A tools/java/lucene-queryparser-4.5.0.jar
A tools/java/lucene-core-4.5.0.jar
I see that you intend to check the Lucene jar files into the Derby source tree
so that this code is always built and tested--even though I also see that you
can build Derby cleanly without the Lucene jar files. I agree with your
decision. We have spent considerable effort making it possible for all
developers to build and test the entire Derby product, and I'd like to stay
with that model. I understand that the Lucene jar files are big. They are an
order of magnitude larger than any other jar files checked into Derby. But I
don't think that including them in our source distributions is burdensome.
I'm not clear on Lucene's compatibility guarantees:
B1) Is there some reason that you chose jar files from Lucene 4.5.0 rather than
the latest 4.5.1 release?
B2) Our documentation should state which versions of Lucene we support. We
should make sure that we verify those claims during release testing.
-------------- LOCALIZATION ----------------
L1) LuceneListIndexesVTI and LuceneQueryVTI raise SQLExceptions with hard-coded
error strings. These should probably be replaced with localizable error
strings. To see how to do this with optional tools, you can look for the error
handle OT_BadLoadUnloadArgs in ForeignDBViews.
L2) In addition, when LuceneQueryVTI.getRawColumn() catches an IOException, it
should use that IOException as the cause of the SQLException that's thrown
afterward.
-------------- SECURITY ----------------
I think that this tool should run comfortably when a Java security manager is
installed. I recommend the following changes:
S1) LuceneSupport.dropIndex() should do its File operations inside privilege
blocks.
S2) LuceneSupportTest should not disable the security manager.
We need to give some thought to the SQL permissions needed to use this Lucene
support. I have observed the following:
S3) Only the DBO can load/unload this tool. That seems fine to me.
S4) Ordinary users don't have permission to create/drop indexes. Is that too
restrictive? It would be friendlier if anyone could create/drop indexes on any
column they own. That is, on any column in a table in a schema that they own.
S5) Ordinary users don't have permission to use the luceneQuery() table
function.
S6) ... but if EXECUTE privilege on luceneQuery() is granted to PUBLIC, then
people can view any indexed data. This seems overbroad. If Fred doesn't have
SELECT privilege on lucenetest.titles.title, then Fred shouldn't be able to see
the contents of that column by running luceneQuery(). A poor man's solution to
this would involve making luceneQuery() issue a dummy select against the
indexed column first, just to make sure that the user has permission to see
that data.
S7) I have some misgivings about how this Lucene support will play with
encrypted databases. The conglomerates will be encrypted but a great deal of
information will leak out if the Lucene indexes are not encrypted. At a
minimum, we should file a follow-on JIRA to collect thoughts/solutions related
to this vulnerability. If we don't address this vulnerability, then we should
at least describe it in our user documentation.
-------------- MISC ----------------
M1) LuceneListIndexesVTI and LuceneSupportTest need Apache license headers.
M2) LuceneQueryVTI imports RestrictedVTI but doesn't implement it. Are you
planning to make LuceneQueryVTI a RestrictedVTI in a future rev?
M3) I like the isolation of the Lucene indexes in a separate "lucene"
subdirectory of the database. I see that createIndex() creates a subdirectory
with a name made out of the schema, table, and column arguments. There are
probably some tricky edge cases involving delimited identifiers and maybe some
security vulnerabilities involved in exposing these names. The edge cases and
hypothetical security issues might be eliminated by giving these subdirectories
less friendly names like $tableID_$columnNumber.
-------------- TEST SCRIPT ----------------
connect 'jdbc:derby:db1;create=true;user=dbo';
call syscs_util.syscs_create_user( 'DBO', 'dbo_password' );
call syscs_util.syscs_create_user( 'LUCENETEST', 'lucenetest_password' );
call syscs_util.syscs_create_user( 'FRED', 'fred_password' );
-- shutdown in order to enable NATIVE authentication
connect 'jdbc:derby:db1;shutdown=true';
connect 'jdbc:derby:db1;user=lucenetest;password=lucenetest_password' as
lucenetest;
create table titles (ID int generated always as identity, ISBN varchar(16),
PRINTISBN varchar(16), title varchar(1024), subtitle varchar(1024), author
varchar(1024), series varchar(1024), publisher varchar(1024), collections
varchar(128), collections2 varchar(128));
insert into titles (ISBN, PRINTISBN, TITLE, SUBTITLE, AUTHOR, SERIES,
PUBLISHER, COLLECTIONS, COLLECTIONS2) values
('9765087650324','9765087650324','The Grapes Of Wrath','The Great Depression in
Oklahoma','John Steinbeck','Noble Winners','The Viking Press','National Book
Award','Pulitzer Prize');
insert into titles (ISBN, PRINTISBN, TITLE, SUBTITLE, AUTHOR, SERIES,
PUBLISHER, COLLECTIONS, COLLECTIONS2) values
('6754278542987','6754278542987','Identical: Portraits of Twins','Best Photo
Book 2012 by American Photo Magazine','Martin
Schoeller','Portraits','teNeues','Photography','');
insert into titles (ISBN, PRINTISBN, TITLE, SUBTITLE, AUTHOR, SERIES,
PUBLISHER, COLLECTIONS, COLLECTIONS2) values
('2747583475882','2747583475882','Vines, Grapes, and Wines','The wine
drinker''s guide to grape varieties','Jancis Robinson','Reference','Alfred A.
Knopf','Wine','');
insert into titles (ISBN, PRINTISBN, TITLE, SUBTITLE, AUTHOR, SERIES,
PUBLISHER, COLLECTIONS, COLLECTIONS2) values
('4356123483483','4356123483483','A Tale of Two Cities','A fictional account of
events leading up to the French revolution','Charles
Dickens','Classics','Chapman & Hall','Fiction','Social Criticism');
-- as expected, this user doesn't have permission to load the tool
call syscs_util.syscs_register_tool('luceneSupport',true);
connect 'jdbc:derby:db1;user=dbo;password=dbo_password' as dbo;
-- succeeds because dbo has permission to load optional tools
call syscs_util.syscs_register_tool('luceneSupport',true);
set connection lucenetest;
-- fails due to lack of permission. probably shouldn't fail.
call LuceneSupport.createIndex('lucenetest','titles','title','id');
set connection dbo;
-- succeeds
call LuceneSupport.createIndex('lucenetest','titles','title','id');
set connection lucenetest;
-- fails due to lack of permission. probably shouldn't fail.
select * from table ( LuceneSupport.luceneQuery('grapes', 'lucenetest.titles',
'title', 0) ) luceneResults;
set connection dbo;
-- succeeds
select * from table ( LuceneSupport.luceneQuery('grapes', 'lucenetest.titles',
'title', 0) ) luceneResults;
grant execute on function LuceneSupport.luceneQuery to public;
set connection lucenetest;
-- works now, after dbo lets everyone run the luceneQuery table function
select * from table ( LuceneSupport.luceneQuery('grapes', 'lucenetest.titles',
'title', 0) ) luceneResults;
connect 'jdbc:derby:db1;user=fred;password=fred_password' as dbo;
-- it's unfortunate that fred can view data in a table he doesn't have SELECT
access to
select * from table ( LuceneSupport.luceneQuery('grapes', 'lucenetest.titles',
'title', 0) ) luceneResults;
-- as expected, the following query fails due to lack of permission
select title from lucenetest.titles where 1=2;
set connection dbo;
select id, schemaname, tablename, columnname from table (
LuceneSupport.listIndexes('', '') ) listindexes;
> How to integrate Derby with Lucene API?
> ---------------------------------------
>
> Key: DERBY-590
> URL: https://issues.apache.org/jira/browse/DERBY-590
> Project: Derby
> Issue Type: Improvement
> Components: Documentation, SQL
> Reporter: Abhijeet Mahesh
> Labels: derby_triage10_11
> Attachments: lucene_demo.diff, lucene_demo_2.diff
>
>
> In order to use derby with lucene API what should be the steps to be taken?
--
This message was sent by Atlassian JIRA
(v6.1#6144)