On Wednesday 27 May 2009 15:57:00 Eric Kow wrote:
> Reinier: could you read Gorsvet.hs itself in a more thorough manner?

Yes, of course. General remarks:
 * Though I complain a bit below about missing documentation, the haddocks of   
   
   hashed-storage are very helpful when present.
 * No glaring bugs found, but quite a lot of questions. When Petr answers 
   them, it's OK to apply the bundle.

First thing I observe about Gorsvet.hs is that it gives an awful lot of 
warnings about missing type signatures and other things.

<imports skipped>. Darcs.Gorsvet does not explicitly declare what it exports. 
I'd rather see it did, but that may be a personal preference. What do you 
thnk?

>floatFn = floatPath . fn2fp

What floatPath does is not so well documented on hackage, especially because 
the AnchoredPath type has no explanation.

>instance ReadableDirectory (StateT TreeState IO) where
>    mDoesDirectoryExist d = gets (\x -> isJust $ findTree (tree x) (floatFn
> d)) mDoesFileExist f = gets (\x -> isJust $ findFile (tree x) (floatFn f))
> mInCurrentDirectory d action = do -- TODO bracket?
>      wd <- gets cwd
>      modify (\x -> x { cwd = floatFn d })
>      x <- action
>      modify (\x -> x { cwd = wd })
>      return x
>    mGetDirectoryContents = error "get dir contents"
>    mReadFilePS p = do x <- readFile (floatFn p) -- ratify readFile: ...
>                       return $ BS.concat (BL.toChunks x)

Wouldn't it make the code more readable to use a type synonym for "StateT 
TreeState IO"? The haddock for hashed-storage mentions "TreeIO" without a link 
sometimes. Is that a type synonym for StateT TreeState IO?

BTW, what is a TreeState? It's not documented anywhere. It appears to be a 
Tree with a current working directory.

>instance WriteableDirectory (StateT TreeState IO) where
>    mWithCurrentDirectory = mInCurrentDirectory
>    mSetFileExecutable _ _ = return ()
>    mWriteFilePS p ps = writeFile -- ratify readFile: haskell_policy is
> stupid. (floatFn p) (BL.fromChunks [ps])
>    mCreateDirectory p = createDirectory (floatFn p)
>    mRename from to = rename (floatFn from) (floatFn to)
>    mRemoveDirectory = unlink . floatFn
>    mRemoveFile = unlink . floatFn

Is it OK to have mSetFileExecutable be a no-op? Or is this something you plan 
to change in the future? Also, I'd love to see haskell_policy cursed in Czech 
for a change :-). For the interested reader: all the functions with the system 
call-like names (unlink, createDirectory, readFile) are not Haskell library 
functions but functions exported by hashed-storage.

>treeDiff :: (FilePath -> FileType) -> Tree -> Tree -> IO (FL Prim)
>treeDiff ft t1 t2 = do
>  (from, to) <- diffTrees t1 t2
>  diffs <- sequence $ zipTrees diff from to
>  return $ foldr (+>+) NilFL (diffs)

I suppose the parentheses around diffs can be removed?

>    where diff :: AnchoredPath -> Maybe TreeItem -> Maybe TreeItem
>               -> IO (FL Prim)
>          diff p (Just (SubTree _)) (Just (SubTree _)) = return NilFL
>          diff p (Just (SubTree _)) Nothing =
>              return $ rmdir (anchorPath "" p) :>: NilFL
>          diff p Nothing (Just (SubTree _)) =
>              return $ adddir (anchorPath "" p) :>: NilFL
>          diff p Nothing b'@(Just (File _)) =
>              do diff <- diff p (Just (File emptyBlob)) b'
>                 return $ addfile (anchorPath "" p) :>: diff
>          diff p a'@(Just (File _)) Nothing =
>              do diff <- diff p a' (Just (File emptyBlob))
>                 return $ diff +>+ (rmfile (anchorPath "" p) :>: NilFL)
>          diff p (Just (File a')) (Just (File b')) =
>              do a <- read a'
>                 b <- read b'
>                 let path = anchorPath "" p
>                 case ft path of
>                   TextFile | no_bin a && no_bin b ->
>                                return $ text_diff path a b
>                   _ -> return $ if a /= b
>                                    then binary path (strict a) (strict b)
> :>: NilFL else NilFL
>          diff p _ _ = fail $ "Missing case at path " ++ show p

How about the case of a file in the one tree and a directory in the other 
tree, or reverse?

>          text_diff p a b
>              | BL.null a && BL.null b = NilFL
>              | BL.null a = diff_from_empty p b
>              | BL.null b = diff_to_empty p a
>              | otherwise = line_diff p (lines a) (lines b)
>
>          line_diff p a b = canonize (hunk p 1 a b)
>          diff_to_empty p x | BL.last x == '\n' = line_diff p (init $ lines
> x) []
>
>                            | otherwise = line_diff p (lines x) [BS.empty]
>
>          diff_from_empty p x = invert (diff_to_empty p x)
>          no_bin = not . is_funky . strict . BL.take 4096
>          lines = map strict . BL.split '\n'
>          strict = BS.concat . BL.toChunks

That is one long function. Perhaps split the 'diff' and 'text_diff' functions 
into separate top-level ones?

>readRecordedAndPending :: (RepoPatch p) => Repository p -> IO Tree
>readRecordedAndPending repo = do
>  pristine <- readDarcsPristine "."
>  Sealed pend <- read_pending repo
>  (_, t) <- virtualTreeIO (apply [] pend) pristine
>  return t

OK.

>unrecordedChanges :: (RepoPatch p) => [DarcsFlag] -> Repository p
>                  -> (Tree -> Tree) -> IO (FL Prim)
>unrecordedChanges opts repo restrict_ = do
>  checkIndex repo
>  slurp_pending repo -- XXX: only here to get us the "pending conflicts"
> check -- that I don't know yet how to implement properly pristine <-
> readDarcsPristine "."
>  Sealed pending_patches <- read_pending repo
>  (res, current') <- virtualTreeIO (apply [] pending_patches) pristine
>  let current = {- restrict -} current'
>
>  working <- case (LookForAdds `elem` opts, IgnoreTimes `elem` opts) of
>               (False, False) -> (restrict_ `fmap` readIndex) >>= unfold
>               (False, True) -> do guide <- unfold current
>                                   restrict guide `fmap` readPlainTree "."
>               (True, _) -> filter nodarcs `fmap` readPlainTree "."
>  ft <- filetype_function
>  diff <- treeDiff ft current working
>  return $ sort_coalesceFL (pending_patches +>+ diff)
>      where nodarcs (AnchoredPath (Name x:_)) _ | x == BS.pack "_darcs" =
> False nodarcs _ _ = True

What's the "restrict" business in this function? It looks OK further. 
Shouldn't the nodarcs function take the user's boring file into account 
eventually?

>-- XXX both application actions below could avoid unfolding if TreeIO would
> be -- smart enough to unfold-as-needed...
>applyToTentativePristine _ patches =
>    do pristine <- readDarcsPristine "." >>= unfold
>       (_, tree) <- hashedTreeIO (apply [] patches)
>                    pristine "_darcs/pristine.hashed"
>       BS.writeFile "_darcs/tentative_pristine" $
>         BS.concat [BS.pack "pristine:"
>                   , darcsFormatHash (fromJust $ treeHash tree)]

OK.

>applyToWorking :: (RepoPatch p) => Repository p -> Sealed (FL Prim) -> IO
> Tree applyToWorking _ (Sealed patches) =
>    do pristine <- readDarcsPristine "." >>= unfold
>       working <- readIndex
>       snd `fmap` plainTreeIO (apply [] patches) working "."

Why is it reading that pristine there? Just to ensure it is in a darcs repo?

>tentativelyMerge r cmd usi themi =

A type signature would have been helpful here.

>  do let us = mapFL_FL hopefully usi
>         them = mapFL_FL hopefully themi
>         (_ :/\: pc) = merge (progressFL "Merging them" them
>
>                                             :\/: progressFL "Merging us"
>                                             : us)
>
>     pend <- unrecordedChanges [] r id
>     anonpend <- anonymous (fromPrims pend)
>     let pend' :/\: pw = merge (pc :\/: anonpend :>: NilFL)
>         pwprim = joinPatches $ mapFL_FL patchcontents pw
>         Sealed standard_resolved_pw = standard_resolution pwprim
>     mapM_ backupByCopying $ list_touched_files standard_resolved_pw
>     have_conflicts <- announce_merge_conflicts cmd [] standard_resolved_pw
>     have_unrecorded_conflicts <- check_unrecorded_conflicts [] pc
>     let Sealed pw_resolution = if have_conflicts ||
> have_unrecorded_conflicts then seal NilFL
>                                   else seal standard_resolved_pw
>     let doChanges :: FL (PatchInfoAnd p) -> IO ()
>         doChanges NilFL = applyps r themi
>         doChanges _     = applyps r (mapFL_FL n2pia pc)
>     doChanges usi
>     setTentativePending r (effect pend' +>+ pw_resolution)
>     return $ seal (effect pwprim +>+ pw_resolution)
>  where mapAdd :: RepoPatch p => Repository p -> FL (PatchInfoAnd p) -> [IO
> ()] mapAdd _ NilFL = []
>        mapAdd r' (a:>:as) =
>               (add_to_tentative_inventory (extractCache r') NoCompression a
> >> return ()) : mapAdd r' as applyps :: (RepoPatch p) => Repository p -> FL
> (PatchInfoAnd p) -> IO () applyps repo ps = do sequence_ $ mapAdd repo ps
>                             applyToTentativePristine repo ps

The code in this function looks quite David-ish. Did you adapt it from 
elsewhere? I have to say I don't understand it. First it merges the changes 
from another source, then it tries to find out if there were conflicts and in 
the end it applies the original merge to the tentative pristine. It does not 
look particularly dangerous, but I can't tell you why that seal is there 
around pw_resolution, for example.

>filter_paths files =
>    \p _ -> any (\x -> x `isPrefix` p || p `isPrefix` x) files
>
>restrict_paths files = if null files
>                          then id
>                          else filter $ filter_paths files
>
>restrict_subpaths = restrict_paths . map (floatPath . fn2fp . sp2fn)

This goes from a list of subpaths and a list of anchored pahs to the list of 
those anchored paths that are prefixes or suffixes of any files in the first 
list of paths. Type signatures and docs are forthcoming, I hope?

>checkIndex repo = do
>  invalid <- doesFileExist "_darcs/index_invalid"
>  exist <- doesFileExist "_darcs/index"
>  when (not exist || invalid) $ updateIndex repo
>  when invalid $ removeFile "_darcs/index_invalid"

OK. Have you thought about races where two darcs processes would be trying to 
rebuild the index at the same time?

>updateIndex repo = do
>  pristine <- readRecordedAndPending repo
>  updateIndexFrom pristine >>= unfold
>  return ()

What determines what index updateIndexFro modifies? The current working 
directory of the process?

>invalidateIndex _ = do
>  BS.writeFile "_darcs/index_invalid" BS.empty

OK.

Regards,
Reinier

Attachment: signature.asc
Description: This is a digitally signed message part.

_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users

Reply via email to