[jira] [Commented] (SOLR-7734) MapReduce Indexer can error when using collection

Gregory Chanan (JIRA) Tue, 14 Jul 2015 16:39:21 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627281#comment-14627281
 ]


Gregory Chanan commented on SOLR-7734:
--------------------------------------

{code}+import com.google.common.base.Charsets;{code}
Is this import necessary?

{code}+ "may be downloaded from this ZooKeeper ensemble."));{code}
Is it "may" because the user might have specified --use-zk-solrconfig.xml?  And 
is it left vague because the help for --use-zk-solrconfig.xml is suppressed?  
That seems more confusing to me than just documenting everything in the help.

{code}
+        if (!options.useZkSolrConfig) {
+          // replace downloaded solrconfig.xml with embedded one
+          InputStream source = MapReduceIndexerTool.class.getResourceAsStream("/solrconfig.indexer.xml");
+          FileOutputStream destination = new FileOutputStream(getSolrConfig(tmpSolrHomeDir));
+          ByteStreams.copy(source, destination);
+         destination.close();
+         source.close();
+        }
{code}
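The indentation looks off here: the two close() calls are indented one space 
less than the lines above them.  It would also be better to close both streams 
in a finally block, so they are released even if the copy throws.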
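
For illustration only, here is a minimal sketch of that cleanup using 
try-with-resources, which is equivalent to closing in a finally block.  The 
class and method names (ConfigCopySketch, replaceFile) are made up for this 
example and are not part of the patch, and the copy loop uses plain java.io 
instead of Guava's ByteStreams so the snippet stands alone.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the SOLR-7734 patch.
final class ConfigCopySketch {

  /**
   * Overwrites destination with the bytes read from source.
   * Both streams are closed even if the copy throws.
   */
  static void replaceFile(InputStream source, File destination) throws IOException {
    // getResourceAsStream returns null when the resource is missing,
    // so fail with a clear message instead of a NullPointerException.
    if (source == null) {
      throw new FileNotFoundException("source resource not found");
    }
    // Resources are closed in reverse order of declaration: out, then in.
    try (InputStream in = source;
         OutputStream out = new FileOutputStream(destination)) {
      byte[] buffer = new byte[8192];
      int read;
      while ((read = in.read(buffer)) != -1) {
        out.write(buffer, 0, read);
      }
    }
  }
}
```

In the patch this would presumably be called with the stream from 
getResourceAsStream("/solrconfig.indexer.xml") and the file returned by 
getSolrConfig(tmpSolrHomeDir).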

{code}
+      <solr-jarify-filesets>
+        <fileset dir="src/resources" />
+      </solr-jarify-filesets>
{code}
When I try to run "ant jar" on the map-reduce contrib I get 
"solr/contrib/map-reduce/src/resources does not exist" -- did you mean for 
solrconfig.indexer.xml to be there?

{code}
+  <luceneMatchVersion>4.10.3</luceneMatchVersion>
{code}
Why the old version?  Should this be 6.0.0 for trunk and 5.something for 
branch_5x?  (I assume you want it in both; tell me if that's incorrect.)

{code}
To enable dynamic schema REST APIs, use the following for <schemaFactory>:
+
+       <schemaFactory class="ManagedIndexSchemaFactory">
+         <bool name="mutable">true</bool>
+         <str name="managedSchemaResourceName">managed-schema</str>
+       </schemaFactory>
{code}
Does this work with managed schemas?  What if the resource name isn't the 
default?

{code}
+  <!-- JMX
+
+       This example enables JMX if and only if an existing MBeanServer
+       is found, use this if you want to configure JMX through JVM
+       parameters. Remove this to disable exposing Solr configuration
+       and statistics to JMX.
+
+       For more details see http://wiki.apache.org/solr/SolrJmx
+    -->
+  <jmx />
{code}
Do we want jmx?  Is it even possible to use in an MR job?

{code}+  <requestDispatcher handleSelect="false" >
+    <!-- Request Parsing{code}
Do we need this whole section?

About testing: I assume the existing tests now use the new (non-overwrite) 
behavior.  What about adding a test for the new option 
(--use-zk-solrconfig.xml)?  Maybe something simple, like defining your own 
update chain that adds a field/value that you expect to see.  And possibly the 
converse, where you add an update.chain and check that the new behavior is 
actually working, i.e. that it doesn't use the solrconfig in ZK.

> MapReduce Indexer can error when using collection
> -------------------------------------------------
>
>                 Key: SOLR-7734
>                 URL: https://issues.apache.org/jira/browse/SOLR-7734
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - MapReduce
>    Affects Versions: 5.2.1
>            Reporter: Mike Drob
>            Assignee: Gregory Chanan
>             Fix For: 5.3, Trunk
>
>         Attachments: SOLR-7734.patch, SOLR-7734.patch, SOLR-7734.patch
>
>
> When running the MapReduceIndexerTool, it will usually pull a 
> {{solrconfig.xml}} from ZK for the collection that it is running against. 
> This can be problematic for several reasons:
> * Performance: The configuration in ZK will likely have several query 
> handlers, and lots of other components that don't make sense in an 
> indexing-only use of EmbeddedSolrServer (ESS).
> * Classpath Resources: If the Solr services are using some kind of additional 
> service (such as Sentry for auth) then the indexer will not have access to 
> the necessary configurations without the user jumping through several hoops.
> * Distinct Configuration Needs: Enabling Soft Commits on the ESS doesn't make 
> sense. There's other configurations that 
> * Update Chain Behaviours: I'm under the impression that UpdateChains may 
> behave differently in ESS than a SolrCloud cluster. Is it safe to depend on 
> consistent behaviour here?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
