Discouraging users from setting timestamps seems to make sense.  In our 
situation we bulk import every 'x' minutes, and if for some reason one of the 
older imports fails and has to be restarted after a later import has run, we 
would like to import the older records at the appropriate timestamp, before the 
timestamp of the later import.  It sounds like that may be one of the situations 
that could trigger some of the internal edge cases, correct?

   Also, just as a separate note: since the timestamp is set in the Mapper, if 
the import has more than one mapper I wouldn't get a consistent timestamp for 
all the records for a given load.  For our use case it is helpful to be able to 
identify all records associated with a given import.

   I went ahead and added a JIRA ( HBASE-3705 ) and uploaded the basic patch.  
I'll update the documentation as well.  
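
   For illustration only, the fallback behaviour amounts to roughly the 
following.  This is a standalone sketch, not the patch attached to the JIRA: 
the class and method names are made up for the example, and only the 
importtsv.timestamp property name comes from my original message below.

```java
public class ImportTimestampSketch {
    // Property name from the thread; the rest of this class is hypothetical.
    static final String TIMESTAMP_PROPERTY = "importtsv.timestamp";

    /**
     * Returns the timestamp given in the property value if one is set,
     * otherwise falls back to the supplied current time.
     */
    static long resolveTimestamp(String propertyValue, long nowMillis) {
        if (propertyValue == null || propertyValue.trim().isEmpty()) {
            return nowMillis;
        }
        return Long.parseLong(propertyValue.trim());
    }

    public static void main(String[] args) {
        // Resolve once per import so every record in the load shares one value.
        long ts = resolveTimestamp(System.getProperty(TIMESTAMP_PROPERTY),
                                   System.currentTimeMillis());
        System.out.println("import timestamp = " + ts);
    }
}
```

   Running it with -Dimporttsv.timestamp=<millis> prints the supplied value; 
without the property it prints the current time.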

   Thanks

   Andy

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Jean-Daniel 
Cryans
Sent: Monday, March 28, 2011 10:51 AM
To: [email protected]
Subject: Re: passing timestamp into importtsv...

I have two thoughts about it:

1- We generally discourage users from setting their own timestamps since it
messes with the internals in some edge cases. Adding this
functionality goes against that.
2- Almost every interface we offer lets users set their own
timestamps, so to be more consistent we should indeed offer it for
importtsv.

So I think you should open a jira and post your patch.

J-D

On Mon, Mar 28, 2011 at 9:36 AM, Andy Sautins
<[email protected]> wrote:
>
>   We have been having a lot of success using the importtsv utility to load 
> data into HBase as described in the wiki 
> (http://hbase.apache.org/bulk-loads.html).  The one issue we have run into is 
> that we would like to assign a specific timestamp to the records associated 
> with the import.  The current ImportTsv.java class sets the timestamp to the 
> current time ( ts = System.currentTimeMillis() ).  We have a patch we have 
> been using that, if a system property ( importtsv.timestamp ) is set, takes 
> the timestamp from that property.  If the property is not set, it uses the 
> current time.  This has been very helpful for us and allows for more control 
> in setting the timestamps for imported records.
>
>   My question is: is this useful functionality in general?  If so, I'd be happy 
> to submit a JIRA and patch with the appropriate changes.
>
>   Thanks
>
>   Andy
>
