D2945: state: add a magicheader to each state file

2020-01-24 Thread baymax (Baymax, Your Personal Patch-care Companion)
This revision now requires changes to proceed.
baymax added a comment.
baymax requested changes to this revision.


  There seems to have been no activities on this Diff for the past 3 Months.
  
  By policy, we are automatically moving it out of the `need-review` state.
  
  Please, move it back to `need-review` without hesitation if this diff should 
still be discussed.
  
  :baymax:need-review-idle:

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D2945/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D2945

To: pulkit, #hg-reviewers, baymax
Cc: jeffpc, yuja, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2945: state: add a magicheader to each state file

2018-06-14 Thread jeffpc (Jeff Sipek)
jeffpc added a comment.


  In https://phab.mercurial-scm.org/D2945#58586, @pulkit wrote:
  
  > There is a suggestion from @jeffpc to have a plain text header in 
statefiles like `HGStateFile` so that user can grep for them or look into files 
and know that they are state files. How does that sound?
  >  (@jeffpc I am not sure I explained this correctly, it will be great if you 
can chime in.)
  
  
  I don't really care about the actual string value, I just think it is a good 
storage design practice to have every on-disk structure contain a magic value 
of some sort.  (Beware, I have a storage background and therefore opinions 
about these things... ;) )  This makes it easier to do all sort of data 
recovery should the need arise.  E.g.,
  
  - if you know that a random file (e.g., from lost+found) is a hg state file, 
you can ignore/delete it and just reclone your repos
  - hg can detect some corruption better, and depending on the state file 
regenerate it or error out
  - hg developers can more easily sort out what's what when looking at 
buffers/files/etc.
  - as an added bonus, the magic can be used for format versioning
  
  FWIW, the three major things that I consider essential to good on-disk 
structures:
  
  1. magic numbers - see above
  2. checksums - possibly two checksums: one for the header and one for the 
payload
  3. back-pointers - see below
  
  Back-pointers allow you to tie the file/buffer into the bigger context - be 
it which state file it is, or which repository it belongs to.  In this case a 
state file's back-pointer could be something like the repo-id + state-file-name 
stored in the state file's header.  (I realize that there is no repo-id, but in 
general some approximation of it tends to be better than nothing.)  This should 
make it painfully obvious what data one is dealing with and how to 
parse/interpret it just by looking at the byte-stream without any other context.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2945

To: pulkit, #hg-reviewers
Cc: jeffpc, yuja, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2945: state: add a magicheader to each state file

2018-06-14 Thread pulkit (Pulkit Goyal)
pulkit added a subscriber: jeffpc.
pulkit added a comment.


  There is a suggestion from @jeffpc to have a plain text header in statefiles 
like `HGStateFile` so that user can grep for them or look into files and know 
that they are state files. How does that sound?
  (@jeffpc I am not sure I explained this correctly, it will be great if you 
can chime in.)

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2945

To: pulkit, #hg-reviewers
Cc: jeffpc, yuja, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2945: state: add a magicheader to each state file

2018-03-28 Thread yuja (Yuya Nishihara)
yuja added a comment.


  > There are some old state files, which don't have a version header on top
  >  of them like graftstate. I need suggestion on how to handle them.
  
  In that case, we would have to either do some heuristic (like histedit), or 
maintain
  old/new state files (like mergestate.) Perhaps graft can do the latter?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2945

To: pulkit, #hg-reviewers
Cc: yuja, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2945: state: add a magicheader to each state file

2018-03-28 Thread pulkit (Pulkit Goyal)
pulkit added a comment.


  In https://phab.mercurial-scm.org/D2945#47831, @yuja wrote:
  
  > I think the magic has to vary depending on the current state file format.
  >  The state file must be backward/forward compatible with the older/newer 
formats.
  >
  > If the current magic is `2\n` (and is parsed as `int(f.readline())`) for 
example, the
  >  CBOR preamble would have to be `3\n`. Otherwise, old hg client cant 
determine
  >  whether the state file is corrupted or written by newer client.
  
  
  There are some old state files, which don't have a version header on top of 
them like graftstate. I need suggestion on how to handle them.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2945

To: pulkit, #hg-reviewers
Cc: yuja, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2945: state: add a magicheader to each state file

2018-03-27 Thread yuja (Yuya Nishihara)
yuja added a comment.


  I think the magic has to vary depending on the current state file format.
  The state file must be backward/forward compatible with the older/newer 
formats.
  
  If the current magic is `2\n` (and is parsed as `int(f.readline())`) for 
example, the
  CBOR preamble would have to be `3\n`. Otherwise, old hg client cant determine
  whether the state file is corrupted or written by newer client.
  
  FWIW, last time I reviewed `simplekeyvaluefile`, I insisted that a parser 
should
  take a file stream instead of a filename, so the caller can handle 
compatibility thingy.
  
with open("statefile", "rb") as fp:
try:
version = int(fp.readline())
except ValueError:
raise some nicer exception
if version == 2:
return readold(fp)
elif version == 3:
return readcbor(fp)

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2945

To: pulkit, #hg-reviewers
Cc: yuja, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D2945: state: add a magicheader to each state file

2018-03-26 Thread pulkit (Pulkit Goyal)
pulkit created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This patch adds a magic header to each state file written using CBOR format so
  that we can distinguish between old state files and new one even if old state
  files get successfully parsed by cbor.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D2945

AFFECTED FILES
  mercurial/state.py

CHANGE DETAILS

diff --git a/mercurial/state.py b/mercurial/state.py
--- a/mercurial/state.py
+++ b/mercurial/state.py
@@ -51,6 +51,7 @@
 self.opts = {}
 else:
 self.opts = opts
+self.opts['magicheader'] = 'HGStateFile'
 
 def __nonzero__(self):
 return self.exists()



To: pulkit, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel