[jira] Updated: (LUCENE-1336) Distributed Lucene using Hadoop RPC based RMI with dynamic classloading

Jason Rutherglen (JIRA) Fri, 18 Jul 2008 11:45:24 -0700

     [ https://issues.apache.org/jira/browse/LUCENE-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Jason Rutherglen updated LUCENE-1336:
-------------------------------------

    Attachment: lucene-1336.patch

lucene-1336.patch

- Added Lucene-specific classes for creating, updating, and searching indexes
- Distributed garbage collection of server-side objects using leases.  Used 
for Searchables that are no longer referenced on the client.  The server 
tracks references to the server object (if desired) and cleans up when the 
count reaches 0.
- Analyzer is now Serializable
- Added TestLuceneServer and TestLuceneClient test cases
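
To illustrate the lease idea in the list above, here is a minimal sketch of a server-side table that counts references and expires leases.  It is not taken from the patch: LeaseTable, its method names, and the string object ids are all hypothetical, and the clock is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical server-side table of leased references to exported objects. */
final class LeaseTable {
  private static final class Entry {
    int refCount;
    long expiresAtMillis;
  }

  private final long leaseMillis;
  private final Map<String, Entry> entries = new HashMap<>();

  LeaseTable(long leaseMillis) {
    this.leaseMillis = leaseMillis;
  }

  /** A client took a reference: increment the count and start or renew the lease. */
  synchronized void acquire(String objectId, long nowMillis) {
    Entry e = entries.computeIfAbsent(objectId, k -> new Entry());
    e.refCount++;
    e.expiresAtMillis = nowMillis + leaseMillis;
  }

  /** A client renews its lease; returns false if the object is no longer tracked. */
  synchronized boolean renew(String objectId, long nowMillis) {
    Entry e = entries.get(objectId);
    if (e == null) return false;
    e.expiresAtMillis = nowMillis + leaseMillis;
    return true;
  }

  /** A client dropped a reference; returns true if the count reached 0 and the entry was removed. */
  synchronized boolean release(String objectId) {
    Entry e = entries.get(objectId);
    if (e == null) return false;
    if (--e.refCount <= 0) {
      entries.remove(objectId);
      return true;
    }
    return false;
  }

  /** Removes entries whose lease has expired (client vanished); returns how many were removed. */
  synchronized int sweep(long nowMillis) {
    int removed = 0;
    for (Iterator<Entry> it = entries.values().iterator(); it.hasNext();) {
      if (it.next().expiresAtMillis <= nowMillis) {
        it.remove();
        removed++;
      }
    }
    return removed;
  }

  synchronized boolean isLive(String objectId) {
    return entries.containsKey(objectId);
  }
}
```

In this sketch the lease is the fallback for clients that disappear without calling release; a real server would presumably also close the underlying Searchable when an entry is removed.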

Todo:
- Add distributed event (callback) notification to LuceneClient when a new 
IndexReader is opened on the server, so that all interested parties always 
hold a remote reference to the latest index version. 
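
One possible shape for that callback mechanism is sketched below.  This is only an assumption about how it might look: IndexVersionListener and IndexVersionNotifier are hypothetical names, and in a distributed setup the listener would be a remote stub rather than a local object.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical callback; in the distributed case this would be a remote reference. */
interface IndexVersionListener {
  void onNewVersion(long generation);
}

/** Hypothetical server-side notifier that tells listeners about newer index generations. */
final class IndexVersionNotifier {
  private final List<IndexVersionListener> listeners = new CopyOnWriteArrayList<>();
  private long generation;

  void addListener(IndexVersionListener listener) {
    listeners.add(listener);
  }

  void removeListener(IndexVersionListener listener) {
    listeners.remove(listener);
  }

  /** Publishes a generation; stale or repeated generations are ignored. Returns true if listeners were notified. */
  synchronized boolean publish(long newGeneration) {
    if (newGeneration <= generation) return false;
    generation = newGeneration;
    for (IndexVersionListener l : listeners) {
      l.onNewVersion(newGeneration);
    }
    return true;
  }
}
```

A client would register a listener once and, on each callback, fetch a fresh searcher for the reported generation.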

Some of the interfaces:

{code}
public interface IndexService extends Remote {
  /** Identifies a version of the index by its generation. */
  public static class IndexVersion implements Serializable {
    public static final long serialVersionUID = 1L;
    public long generation;
  }
  
  /** Marker interface for operations that can be batched through execute(). */
  public static interface Operation {
  }
  
  public static class Add implements Serializable, Operation {
    public static final long serialVersionUID = 1L;
    public Analyzer analyzer;
    public Document document;
  }
  
  public static class Update implements Serializable, Operation {
    public static final long serialVersionUID = 1L;
    public Document document;
    public Term term;
    public Analyzer analyzer;
  }
  
  public static class Delete implements Serializable, Operation {
    public static final long serialVersionUID = 1L;
    public Query query;
    public Term term;
  }
  
  public SearchableService reopen() throws Exception;
  public void close() throws Exception;
  public IndexInfo getIndexInfo() throws Exception;
  
  /**
   * Executes a batch of index-changing operations (add, update, or delete).
   * @param operations the operations to apply
   * @throws Exception
   */
  public void execute(Operation[] operations) throws Exception;
  public void addDocument(Document document, Analyzer analyzer) throws Exception;
  public void updateDocument(Term term, Document document, Analyzer analyzer)
      throws Exception;
  public void deleteDocuments(Term term) throws Exception;
  public void deleteDocuments(Query query) throws Exception;
  public void flush() throws Exception;
}
{code}

{code}
public interface SearchableService extends Searchable {
  public IndexVersion getIndexVersion() throws Exception;
  public Document[] docs(int[] docs, FieldSelector fieldSelector)
      throws CorruptIndexException, IOException;
}
{code}

{code}
public interface IndexManagerService {
  
  public static class IndexInfo implements Serializable {
    public static final long serialVersionUID = 1L;
    public String name;
    public String serviceName;
    public long length;
    public IndexSettings indexSettings;
  }
  
  public static class IndexSettings implements Serializable {
    public static final long serialVersionUID = 1L;
    public Analyzer defaultAnalyzer;
    public int maxFieldLength;
    public Double ramBufferSizeMB;
  }

  public IndexService createIndex(String name, IndexSettings indexSettings)
      throws Exception;
  public IndexInfo[] getIndexInfos() throws Exception;
  public void deleteIndex(String name) throws Exception;
}
{code}

> Distributed Lucene using Hadoop RPC based RMI with dynamic classloading
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-1336
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1336
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/*
>    Affects Versions: 2.3.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: lucene-1336.patch, lucene-1336.patch
>
>
> Hadoop RPC based RMI system for use with Lucene Searchable.  It keeps the 
> application logic on the client side, removing the need to deploy 
> application logic to the Lucene servers and to provision new code to 
> potentially hundreds of servers for every application logic change.  
> The use case is any deployment requiring Lucene on many servers.  This system 
> also allows custom Query and Filter classes (or other classes) to be defined 
> on, for example, a development machine and executed on the server without 
> first deploying the custom classes to the servers.  This can save a lot of 
> time and effort in provisioning and restarting processes.  In the future 
> this patch will include an IndexWriterService interface which will enable 
> document indexing.  This will allow subclasses of Analyzer to be dynamically 
> loaded onto a server as documents are added by the client.
> Hadoop RPC is more scalable than Sun's RMI implementation because it uses 
> non-blocking sockets.  Hadoop RPC is also far easier to understand and 
> customize if needed, as it is embodied in two main classes, 
> org.apache.hadoop.ipc.Client and org.apache.hadoop.ipc.Server.  
> Features include automatic dynamic classloading, which enables newly 
> compiled client classes that extend core classes such as Query or Filter to 
> be used to query the server without first deploying the code to the server.  
> Standard RMI dynamic classloading is rarely used in practice because it is 
> hard to set up: the new code must be placed in JAR files on a web server on 
> the client, and custom system properties and a Java security manager 
> configuration must be set up as well.  
> The dynamic classloading in Hadoop RMI for Lucene uses RMI to load the 
> classes.  Custom serialization and deserialization manage the classes and 
> the class versions on the server and client side.  New class files are 
> automatically detected and loaded using ClassLoader.getResourceAsStream, 
> so this system does not require creating a JAR file.  The same networking 
> system used for remote method invocation is also used for loading classes 
> over the network.  This removes the need for a separate web server dedicated 
> to the task and reduces deployment to a few lines of code.
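
The class-loading step described above can be sketched roughly as follows.  This is not the patch's code: ClassBytes and BytesClassLoader are hypothetical names, and the sketch leaves out class versioning and the network transport.  It only shows reading a compiled class through ClassLoader.getResourceAsStream on the sending side and defining it from raw bytes on the receiving side.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical sending side: reads a class's compiled bytes from the classpath, no JAR needed. */
final class ClassBytes {
  static byte[] read(Class<?> cls) {
    String resource = cls.getName().replace('.', '/') + ".class";
    ClassLoader loader = cls.getClassLoader();
    // A null loader means the bootstrap loader; fall back to the system lookup.
    try (InputStream in = loader != null
        ? loader.getResourceAsStream(resource)
        : ClassLoader.getSystemResourceAsStream(resource)) {
      if (in == null) {
        throw new IOException("No class file found for " + cls.getName());
      }
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[4096];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
      return out.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}

/** Hypothetical receiving side: defines a class from bytes that arrived over the wire. */
final class BytesClassLoader extends ClassLoader {
  BytesClassLoader(ClassLoader parent) {
    super(parent);
  }

  Class<?> define(String name, byte[] bytes) {
    return defineClass(name, bytes, 0, bytes.length);
  }
}
```

In a real system the server would presumably request these bytes only when deserialization hits a class it does not yet have, or has in an older version.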

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


