On 6/21/06, Marvin Humphrey <[EMAIL PROTECTED]> wrote:
Greets,

There are some portability problems that may not be worth solving.

On some Crays, ints, longs and pointers are all 8 bytes (the ILP64
format).  I propose not supporting any machine where we can't
guarantee that lucy_i8_t is 1 byte and lucy_i32_t is 4 bytes.
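
A minimal sketch of how that guarantee could be enforced at compile time, so an unsupported machine fails the build instead of producing corrupt indexes. The `LUCY_STATIC_ASSERT` macro and the `<stdint.h>` typedefs are assumptions made for illustration; only the names `lucy_i8_t` and `lucy_i32_t` come from the proposal above.

```c
#include <limits.h>
#include <stdint.h>

/* Assumption: the Lucy integer types are aliases for the C99 exact-width
 * types.  A real build might probe for them at configure time instead. */
typedef int8_t  lucy_i8_t;
typedef int32_t lucy_i32_t;

/* Hypothetical helper: a negative array size is a compile error, so the
 * build stops on any platform where the condition is false. */
#define LUCY_STATIC_ASSERT(cond, name) \
    typedef char lucy_static_assert_##name[(cond) ? 1 : -1]

LUCY_STATIC_ASSERT(CHAR_BIT == 8,           char_is_8_bits);
LUCY_STATIC_ASSERT(sizeof(lucy_i8_t)  == 1, i8_is_1_byte);
LUCY_STATIC_ASSERT(sizeof(lucy_i32_t) == 4, i32_is_4_bytes);
```

The negative-array-size trick needs nothing newer than C89; a compiler with C11 support could use `_Static_assert` for the same purpose.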

A second esoteric problem is machines that don't use IEEE 754 for
floats: <http://www.codeproject.com/tools/libnumber.asp>.  I think
that the norms-encoding routine will break on such machines.  That
ought to be the only problem, but it's gnarly enough that I think
we should just decide not to support those boxes.

Sounds fine to me. If someone needs Lucy to work on one of those
boxes, it will just be a simple matter of them supplying us with
float2byte and byte2float methods.
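
For reference, a rough sketch of what such a pair might look like on an IEEE 754 machine. The names `float2byte` and `byte2float` are taken from the paragraph above; the layout (3 mantissa bits, 5 exponent bits, zero exponent point at 15) is modelled on the one-byte float scheme in Lucene's SmallFloat class and should be treated as an assumption, not as Lucy's actual format. It also shows where the IEEE dependency lives: the code reads the float's raw bit pattern, which is exactly what a non-IEEE port would have to replace.

```c
#include <stdint.h>
#include <string.h>

/* Encode a float into one byte.  Assumes a 32-bit IEEE 754 float. */
static uint8_t float2byte(float f) {
    uint32_t bits;
    uint32_t small;
    const uint32_t base = (63 - 15) << 3;      /* exponent offset, 384 */

    memcpy(&bits, &f, sizeof bits);            /* well-defined type pun */
    if (bits == 0 || (bits & 0x80000000u)) {
        return 0;                              /* zero and negatives -> 0 */
    }
    small = bits >> (24 - 3);                  /* keep exponent + 3 mantissa bits */
    if (small <= base) {
        return 1;                              /* underflow: smallest positive */
    }
    if (small >= base + 0x100) {
        return 0xFF;                           /* overflow: largest value */
    }
    return (uint8_t)(small - base);
}

/* Decode a byte produced by float2byte back into a float. */
static float byte2float(uint8_t b) {
    uint32_t bits;
    float f;

    if (b == 0) {
        return 0.0f;
    }
    bits = (uint32_t)b << (24 - 3);
    bits += (uint32_t)(63 - 15) << 24;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With this layout 1.0f encodes to byte 124 and decodes back to exactly 1.0f; values between representable points are truncated toward zero.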

Another wrinkle is large file support.  Machines that don't support
large files are growing scarcer by the day, but eventually, somebody
who has one will want to use Lucy.  Index files can get pretty big.

Is it even possible for a machine to have large file support and not
provide a 64-bit integer?  The only thing Lucene ever uses 64-bit
integers for is file pointers.  KinoSearch takes advantage of this in
a weird way -- it uses doubles wherever Lucene uses Java longs.  I
did it that way because Perl always provides support for doubles, but
64-bit integer support takes a special compile and generally doesn't
work very well.  The 52-bit mantissa in an IEEE 754 double is more
than enough for any file pointer.  But when I made that call, I was
using native Perl filehandles as InStream objects; KinoSearch doesn't
do that anymore, and I don't think we should go the doubles-as-file-
pointers route with Lucy (even though it Just Works).
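
To make the doubles-as-file-pointers point concrete: with 52 stored mantissa bits plus the implicit leading bit, an IEEE 754 double represents every integer up to 2^53 exactly, which covers offsets up to 8 PiB. A small sketch, with hypothetical helper names that are not taken from KinoSearch:

```c
#include <float.h>
#include <stdint.h>

#if FLT_RADIX != 2 || DBL_MANT_DIG < 53
#error "sketch assumes a double with at least 53 bits of binary precision"
#endif

/* Largest magnitude at which every integer is still exactly representable. */
#define MAX_EXACT_DOUBLE_INT ((int64_t)1 << 53)

/* Returns 1 if the offset survives a round trip through a double. */
static int offset_fits_in_double(int64_t offset) {
    return offset >= -MAX_EXACT_DOUBLE_INT && offset <= MAX_EXACT_DOUBLE_INT;
}

/* Hypothetical conversions; only safe when offset_fits_in_double() holds. */
static double offset_to_double(int64_t offset) {
    return (double)offset;
}

static int64_t double_to_offset(double d) {
    return (int64_t)d;
}
```

Past 2^53 adjacent integers start to collapse onto the same double, so 2^53 + 1 is the first offset that cannot be stored exactly.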

I'm inclined to require both large file support and 64-bit integers
for Lucy.  What say?

I'm not sure about large file support. You've looked into it more than
I have, but I do think 64-bit integers are a must.

[Aside: what I'm doing in Ferret is storing all file pointers as off_t.
As well as read/write_vint methods, I have read/write_voff_t. The only
time I use 64-bit integers (i.e. always 64-bit, unlike off_t, which could
be 32-bit) is when I need to write a fixed-size pointer, as in
the fields and term_vectors index files. I've only just implemented
this, but it seems to be working.]
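
A sketch of what variable-length off_t routines along these lines could look like. The names `write_voff_t` and `read_voff_t` follow the aside above, but the buffer-based signatures and the byte layout (7 payload bits per byte, low-order group first, high bit set on every byte except the last, as in Lucene's VInt) are assumptions, not Ferret's actual code.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>   /* off_t (POSIX) */

/* Writes a non-negative offset to buf in variable-length form and returns
 * the number of bytes used.  buf needs room for up to 10 bytes when off_t
 * is 64 bits wide. */
static size_t write_voff_t(uint8_t *buf, off_t value) {
    uint64_t v = (uint64_t)value;   /* assumes value >= 0 */
    size_t n = 0;

    while (v > 0x7F) {
        buf[n++] = (uint8_t)((v & 0x7F) | 0x80);   /* more bytes follow */
        v >>= 7;
    }
    buf[n++] = (uint8_t)v;                         /* final byte, high bit clear */
    return n;
}

/* Reads an offset written by write_voff_t and returns the bytes consumed. */
static size_t read_voff_t(const uint8_t *buf, off_t *out) {
    uint64_t v = 0;
    unsigned shift = 0;
    size_t n = 0;
    uint8_t b;

    do {
        b = buf[n++];
        v |= (uint64_t)(b & 0x7F) << shift;
        shift += 7;
    } while (b & 0x80);

    *out = (off_t)v;
    return n;
}
```

Because the encoded length depends on the value and not on sizeof(off_t), a file written this way reads the same on 32-bit and 64-bit off_t builds as long as the values fit; fixed-width fields are where an explicit 64-bit type is still needed.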
