I checked all possibilities we have, also by reading source code of JDK:

 

We can have „one migration strategy”:

-          We forbid writing with OutputStream, so we won’t produce new 
ASCII-only files and we write new files as UTF-8. Older Solr versions no longer 
can read those files, but this is not a problem.

-          We forbid reading with InputStream, because that one can no longer 
read files written as UTF-8 without escapes.

-          We allow only reading by Reader and the Reader must be UTF-8 -> this 
allows to still load old properties files loaded by older Solr versions 
(because when they were written in the old format, the reader code allows also 
Unicode escapes with \u, so “old-style” files are still parseable.

 

So this would allow us to enforce the Reader/Writer API, if the charset to be 
used is UTF-8

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

 <http://www.thetaphi.de/> http://www.thetaphi.de

eMail: u...@thetaphi.de

 

From: Uwe Schindler [mailto:u...@thetaphi.de] 
Sent: Friday, July 12, 2013 2:09 PM
To: dev@lucene.apache.org
Subject: RE: svn commit: r1502468 - in /lucene/dev/trunk/solr/core/src: 
java/org/apache/solr/core/CorePropertiesLocator.java 
test/org/apache/solr/core/TestSolrXmlPersistor.java

 

Hi,

 

There are several reasons why I added this:

The biggest issue is consistence: we should decide to use either one or the 
other, but not mixed. The recent commits on Solr were wrong in that case, 
because the wrote the files using a defined reader but read it using the 
InputStream and similar problems. The commit was to prevent this.

 

I agree, both properties files formats are defined in JDK6 docs, but the 
“original” one defined by Sun once a while back was specified to be: 
“ISO-8859-1 and Unicode escapes”. And almost all the properties files out there 
are using this encoding, including all of the ones included in the JDK (see 
your JDK folder, all properties files there are in this format, one example: 
$JAVA_HOME/jre/lib/deploy/messages_ja.properties for a crazy one). I agree, for 
newer developments one should use a newer format, but the problem we have in 
Solr is that we are no longer able to read old properties files – which were 
always written in the Sun original specification format (see 
http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Properties.html for the 
format).

 

The commit was just there to make it consistent and enforce consistence – 
that’s all. We can discuss about that, maybe only enforce this for Solr. For 
the core load/save the properties files are written by machines only, so nobody 
edits them by hand. Lucene does not use properties files, that was the hole 
thing.

 

Robert, please, before complaining again – don’t think about fucking Unicode 
only, think about standards defined long time ago (and the properties file 
format is one of those). This is just for consistency. The reasoning behind the 
whole thing is similar to my complaints about XML: XML also needs to be read 
through an InputStream because they are binary and charsetless (charset is part 
of the fileformat; application/xml and not text/xml is the MIME type). 
Properties files are somehow also binary J and were defined in the past to use 
Unicode Escapes and ISO-8859-1 charset 
(http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Properties.html - 
that’s the oldest one I got).

 

In general we should not use properties files at all, so I would personally 
forbid them completely, but Solr used them for longer time now.

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

 <http://www.thetaphi.de/> http://www.thetaphi.de

eMail: u...@thetaphi.de

 

From: Robert Muir [mailto:rcm...@gmail.com] 
Sent: Friday, July 12, 2013 1:41 PM
To: dev@lucene.apache.org
Subject: Re: svn commit: r1502468 - in /lucene/dev/trunk/solr/core/src: 
java/org/apache/solr/core/CorePropertiesLocator.java 
test/org/apache/solr/core/TestSolrXmlPersistor.java

 

Where is this 'officially defined one' according to JVM spec? Please give a 
reference, as the Reader api is just as well defined as the InputStream API.

If we want consistency, I want the Reader one!
Just so you know, i totally disagree with this. I refuse to use native2ascii. 
I think these commits banning the Reader API should be reverted.

On Fri, Jul 12, 2013 at 4:53 AM, Uwe Schindler <u...@thetaphi.de> wrote:

Yes, and be sure to do the opposite when reading properties files! Otherwise it 
is not consistent and loading/saving and exceptions may happen.

I am working on forbidden-apis to ensure we are consistent everywhere. The 
binary properties file format is the officially defined one according to JVM 
spec.


Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----

> From: Alan Woodward [mailto:a...@flax.co.uk]
> Sent: Friday, July 12, 2013 10:47 AM
> To: dev@lucene.apache.org
> Subject: Re: svn commit: r1502468 - in /lucene/dev/trunk/solr/core/src:
> java/org/apache/solr/core/CorePropertiesLocator.java
> test/org/apache/solr/core/TestSolrXmlPersistor.java
>
> Does this fix it?
>
> @@ -79,9 +76,7 @@ public class CorePropertiesLocator implements
> CoresLocator {
>      OutputStream os = null;
>      try {
>        os = new FileOutputStream(propfile);
> -      Writer writer = new OutputStreamWriter(os, Charsets.UTF_8);
> -      p.store(writer, "Written by CorePropertiesLocator on " + new Date());
> -      writer.close();
> +      p.store(os, "Written by CorePropertiesLocator on " + new Date());
>      }
>      catch (IOException e) {
>
> Alan Woodward
> www.flax.co.uk
>
>
> On 12 Jul 2013, at 09:35, Uwe Schindler wrote:
>
> > Hi,
> >
> > you have tob e careful: If you store properties with a writer but load it 
> > with
> InputStream, the code is different. Properties files have a defined charset of
> ISO-8859-1:
> >
> > "The load(Reader) / store(Writer, String) methods load and store
> properties from and to a character based stream in a simple line-oriented
> format specified below. The load(InputStream) / store(OutputStream,
> String) methods work the same way as the load(Reader)/store(Writer,
> String) pair, except the input/output stream is encoded in ISO 8859-1
> character encoding. Characters that cannot be directly represented in this
> encoding can be written using Unicode escapes as defined in section 3.3 of
> The Java™ Language Specification; only a single 'u' character is allowed in an
> escape sequence. The native2ascii tool can be used to convert property files
> to and from other character encodings."
> >
> > So be sure to be consistent when loading/saving! If we previously (in older
> Solr version) used the InputStream methods to load/store core props, we
> should use ISO-8859-1 to load/store to be compatible with older versions!
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >
> >> -----Original Message-----
> >> From: romseyg...@apache.org [mailto:romseyg...@apache.org]
> >> Sent: Friday, July 12, 2013 10:26 AM
> >> To: comm...@lucene.apache.org
> >> Subject: svn commit: r1502468 - in /lucene/dev/trunk/solr/core/src:
> >> java/org/apache/solr/core/CorePropertiesLocator.java
> >> test/org/apache/solr/core/TestSolrXmlPersistor.java
> >>
> >> Author: romseygeek
> >> Date: Fri Jul 12 08:25:36 2013
> >> New Revision: 1502468
> >>
> >> URL: http://svn.apache.org/r1502468
> >> Log:
> >> SOLR-4914: Close OutputStreamWriter properly, use
> >> System.getProperty("line.separator") instead of \n
> >>
> >> Fixes Windows test failures.
> >>
> >> Modified:
> >>
> >> lucene/dev/trunk/solr/core/src/java/org/apache/solr/core/CoreProperti
> >> esL
> >> ocator.java
> >>
> >> lucene/dev/trunk/solr/core/src/test/org/apache/solr/core/TestSolrXmlP
> >> ersi
> >> stor.java
> >>
> >> Modified:
> >> lucene/dev/trunk/solr/core/src/java/org/apache/solr/core/CoreProperti
> >> esL
> >> ocator.java
> >> URL:
> >> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/
> >> apa
> >>
> che/solr/core/CorePropertiesLocator.java?rev=1502468&r1=1502467&r2=15
> >> 0
> >> 2468&view=diff
> >>
> ==========================================================
> >> ====================
> >> ---
> >> lucene/dev/trunk/solr/core/src/java/org/apache/solr/core/CoreProperti
> >> esL
> >> ocator.java (original)
> >> +++
> lucene/dev/trunk/solr/core/src/java/org/apache/solr/core/CoreProp

> >> +++ ert iesLocator.java Fri Jul 12 08:25:36 2013

> >> @@ -20,6 +20,7 @@ package org.apache.solr.core;  import
> >> com.google.common.base.Charsets;  import
> >> com.google.common.collect.Lists;  import
> >> org.apache.solr.common.SolrException;
> >> +import org.apache.solr.util.IOUtils;
> >> import org.slf4j.Logger;
> >> import org.slf4j.LoggerFactory;
> >>
> >> @@ -27,6 +28,7 @@ import java.io.File; import
> >> java.io.FileInputStream; import java.io.FileOutputStream; import
> >> java.io.IOException;
> >> +import java.io.OutputStream;
> >> import java.io.OutputStreamWriter;
> >> import java.io.Writer;
> >> import java.util.Date;
> >> @@ -56,14 +58,7 @@ public class CorePropertiesLocator imple
> >>         throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
> >>                                 "Could not create a new core in " + 
> >> cd.getInstanceDir()
> >>                               + "as another core is already defined 
> >> there");
> >> -      try {
> >> -        Properties p = buildCoreProperties(cd);
> >> -        Writer writer = new OutputStreamWriter(new
> >> FileOutputStream(propFile), Charsets.UTF_8);
> >> -        p.store(writer, "Written by CorePropertiesLocator on " + new
> Date());
> >> -      }
> >> -      catch (IOException e) {
> >> -        logger.error("Couldn't persist core properties to {}: {}",
> >> propFile.getAbsolutePath(), e);
> >> -      }
> >> +      writePropertiesFile(cd, propFile);
> >>     }
> >>   }
> >>
> >> @@ -75,14 +70,25 @@ public class CorePropertiesLocator imple
> >>   public void persist(CoreContainer cc, CoreDescriptor... coreDescriptors) 
> >> {
> >>     for (CoreDescriptor cd : coreDescriptors) {
> >>       File propFile = new File(new File(cd.getInstanceDir()),
> >> PROPERTIES_FILENAME);
> >> -      try {
> >> -        Properties p = buildCoreProperties(cd);
> >> -        Writer writer = new OutputStreamWriter(new
> >> FileOutputStream(propFile), Charsets.UTF_8);
> >> -        p.store(writer, "Written by CorePropertiesLocator on " + new
> Date());
> >> -      }
> >> -      catch (IOException e) {
> >> -        logger.error("Couldn't persist core properties to {}: {}",
> >> propFile.getAbsolutePath(), e);
> >> -      }
> >> +      writePropertiesFile(cd, propFile);
> >> +    }
> >> +  }
> >> +
> >> +  private void writePropertiesFile(CoreDescriptor cd, File propfile)  {
> >> +    Properties p = buildCoreProperties(cd);
> >> +    OutputStream os = null;
> >> +    try {
> >> +      os = new FileOutputStream(propfile);
> >> +      Writer writer = new OutputStreamWriter(os, Charsets.UTF_8);
> >> +      p.store(writer, "Written by CorePropertiesLocator on " + new 
> >> Date());
> >> +      writer.close();
> >> +    }
> >> +    catch (IOException e) {
> >> +      logger.error("Couldn't persist core properties to {}: {}",
> >> propfile.getAbsolutePath(), e);
> >> +    }
> >> +    finally {
> >> +      if (os != null)
> >> +        IOUtils.closeQuietly(os);
> >>     }
> >>   }
> >>
> >>
> >> Modified:
> >> lucene/dev/trunk/solr/core/src/test/org/apache/solr/core/TestSolrXmlP
> >> ersi
> >> stor.java
> >> URL:
> >> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/test/org/
> >> apa
> >>
> che/solr/core/TestSolrXmlPersistor.java?rev=1502468&r1=1502467&r2=150
> >> 2
> >> 468&view=diff
> >>
> ==========================================================
> >> ====================
> >> ---
> >> lucene/dev/trunk/solr/core/src/test/org/apache/solr/core/TestSolrXmlP
> >> ersi
> >> stor.java (original)
> >> +++ lucene/dev/trunk/solr/core/src/test/org/apache/solr/core/TestSolr

> >> +++ Xml Persistor.java Fri Jul 12 08:25:36 2013

> >> @@ -70,8 +70,8 @@ public class TestSolrXmlPersistor {
> >>
> >>     SolrXMLCoresLocator persistor = new SolrXMLCoresLocator(new
> >> File("testfile.xml"), solrxml, null);
> >>     assertEquals(persistor.buildSolrXML(cds),
> >> -          "<solr><cores>\n"
> >> -        + "    <core name=\"testcore\" instanceDir=\"instance/dir/\"/>\n"
> >> +          "<solr><cores>" + SolrXMLCoresLocator.NEWLINE
> >> +        + "    <core name=\"testcore\" instanceDir=\"instance/dir/\"/>" +
> >> SolrXMLCoresLocator.NEWLINE
> >>         + "</cores></solr>");
> >>   }
> >>
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
> > additional commands, e-mail: dev-h...@lucene.apache.org
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
> commands, e-mail: dev-h...@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

 

Reply via email to