[x265] [PATCH] asm: correct improper macro

2014-02-20 Thread dnyaneshwar
# HG changeset patch
# User Dnyaneshwar G dnyanesh...@multicorewareinc.com
# Date 1392885326 -19800
#  Thu Feb 20 14:05:26 2014 +0530
# Node ID 7e1d61e583b8c28280fe79bc29e2f4a66579d061
# Parent  3389061b75a486e004409ab628c46fed39d03b72
asm: correct improper macro

diff -r 3389061b75a4 -r 7e1d61e583b8 source/common/x86/dct8.asm
--- a/source/common/x86/dct8.asm Wed Feb 19 17:03:21 2014 -0600
+++ b/source/common/x86/dct8.asm Thu Feb 20 14:05:26 2014 +0530
@@ -362,10 +362,10 @@
 INIT_XMM sse2
 cglobal idst4, 3, 4, 7
 %if BIT_DEPTH == 8
-  %define m6  [pd_2048]
+  mova m6, [pd_2048]
   %define IDCT4_SHIFT 12
 %elif BIT_DEPTH == 10
-  %define m6  [pd_512]
+  mova m6, [pd_512]
   %define IDCT4_SHIFT 10
 %else
   %error Unsupported BIT_DEPTH!
___
x265-devel mailing list
x265-devel@videolan.org
https://mailman.videolan.org/listinfo/x265-devel


Re: [x265] APPCRASH in x265 0.7+207 while encoding in preset 'slow' or slower...

2014-02-20 Thread Mario *LigH* Rohkrämer

On 20.02.2014 at 00:19, Steve Borho st...@borho.org wrote:


>>> But quality in default CRF 28 is now a lot worse, files now even about
>>> half the size as before, in presets {fast..placebo}.
>>>
>>> --preset faster: 544.26 kbps, 20.311 dB SSIM
>>> --preset fast: 56.79 kbps, 13.542 dB SSIM
>>> --preset slow: 51.78 kbps, 13.493 dB SSIM
>>>
>>> (Sintel trailer, 640x272, no additional options except logging)
>>
>> will look into this next, thanks for reporting.
>
> My fault, a fix for this was just pushed.


OK, so with v0.7+222, we are back to ~1:5 file size and ~5 dB SSIM  
difference. This is a different topic then...


--
__

Fun and success!
Mario *LigH* Rohkrämer
mailto:cont...@ligh.de



[x265] cleanup useless getMvPred*(): always zero mv

2014-02-20 Thread Satoshi Nakagawa
# HG changeset patch
# User Satoshi Nakagawa nakagawa...@oki.com
# Date 1392889699 -32400
#  Thu Feb 20 18:48:19 2014 +0900
# Node ID 5d27f9feb54079cb63937c342513f2428670b0f5
# Parent  3389061b75a486e004409ab628c46fed39d03b72
cleanup useless getMvPred*(): always zero mv

diff -r 3389061b75a4 -r 5d27f9feb540 source/Lib/TLibCommon/TComDataCU.h
--- a/source/Lib/TLibCommon/TComDataCU.h Wed Feb 19 17:03:21 2014 -0600
+++ b/source/Lib/TLibCommon/TComDataCU.h Thu Feb 20 18:48:19 2014 +0900
@@ -135,9 +135,6 @@
 TComDataCU*   m_cuAbove; ///< pointer of above CU
 TComDataCU*   m_cuLeft;  ///< pointer of left CU
 TComDataCU*   m_cuColocated[2];  ///< pointer of temporally colocated CU's for both directions
-TComMvField   m_mvFieldA;///< motion vector of position A
-TComMvField   m_mvFieldB;///< motion vector of position B
-TComMvField   m_mvFieldC;///< motion vector of position C
 
 // 
---
 // coding tool information
@@ -388,12 +385,6 @@
 
 void  clipMv(MV& outMV);
 
-void  getMvPredLeft(MV& mvPred)   { mvPred = m_mvFieldA.mv; }
-
-void  getMvPredAbove(MV& mvPred)  { mvPred = m_mvFieldB.mv; }
-
-void  getMvPredAboveRight(MV& mvPred) { mvPred = m_mvFieldC.mv; }
-
 // 
---
 // utility functions for neighboring information
 // 
---
diff -r 3389061b75a4 -r 5d27f9feb540 source/Lib/TLibEncoder/TEncSearch.cpp
--- a/source/Lib/TLibEncoder/TEncSearch.cpp Wed Feb 19 17:03:21 2014 -0600
+++ b/source/Lib/TLibEncoder/TEncSearch.cpp Thu Feb 20 18:48:19 2014 +0900
@@ -2613,10 +2613,6 @@
 Pel* pu = fenc->getLumaAddr(cu->getAddr(), cu->getZorderIdxInCU() + partAddr);
 m_me.setSourcePU(pu - fenc->getLumaAddr(), roiWidth, roiHeight);
 
-cu->getMvPredLeft(m_mvPredictors[0]);
-cu->getMvPredAbove(m_mvPredictors[1]);
-cu->getMvPredAboveRight(m_mvPredictors[2]);
-
 bool bTestNormalMC = true;
 
 if (bUseMRG && cu->getWidth(0) > 8 && numPart == 2)
@@ -2648,7 +2644,7 @@
 MV mvmin, mvmax;
 xSetSearchRange(cu, mvp, merange, mvmin, mvmax);
 int satdCost = m_me.motionEstimate(m_mref[list][idx],
-   mvmin, mvmax, mvp, 3, m_mvPredictors, merange, outmv);
+   mvmin, mvmax, mvp, 0, m_mvPredictors, merange, outmv);
 
 /* Get total cost of partition, but only include MV bit cost once */
 bitsTemp += m_me.bitcost(outmv);


[x265] [PATCH 0 of 6 ] cu level vbv ratecontrol

2014-02-20 Thread aarthi



[x265] [PATCH 4 of 6] vbv: fix bugs in vbv flow with single pass ABR

2014-02-20 Thread aarthi
# HG changeset patch
# User Aarthi Thirumalai
# Date 139283 -19800
#  Thu Feb 20 18:09:53 2014 +0530
# Node ID c5c07a3ee7fcf0f331f06f83b7b3bc0b1bcc1668
# Parent  ebc23ec5ac1c5f8e2abe0d0b88414fabd2fdf1bb
vbv: fix bugs in vbv flow with single pass ABR

diff -r ebc23ec5ac1c -r c5c07a3ee7fc source/encoder/encoder.cpp
--- a/source/encoder/encoder.cpp Thu Feb 20 17:47:53 2014 +0530
+++ b/source/encoder/encoder.cpp Thu Feb 20 18:09:53 2014 +0530
@@ -367,7 +367,7 @@
 // Allow this frame to be recycled if no frame encoders are using it for reference
 ATOMIC_DEC(&out->m_countRefEncoders);
 
-m_rateControl->rateControlEnd(bits, &(curEncoder->m_rce));
+m_rateControl->rateControlEnd(out, bits, &(curEncoder->m_rce));
 
 m_dpb->recycleUnreferenced(m_freeList);
 
diff -r ebc23ec5ac1c -r c5c07a3ee7fc source/encoder/ratecontrol.cpp
--- a/source/encoder/ratecontrol.cpp Thu Feb 20 17:47:53 2014 +0530
+++ b/source/encoder/ratecontrol.cpp Thu Feb 20 18:09:53 2014 +0530
@@ -207,7 +207,9 @@
 RateControl::RateControl(TEncCfg * _cfg)
 {
 this->cfg = _cfg;
-ncu = (int)((cfg->param.sourceHeight * cfg->param.sourceWidth) / pow((int)16, 2.0));
+int lowresCuWidth = ((cfg->param.sourceWidth/2) + X265_LOWRES_CU_SIZE - 1) >> X265_LOWRES_CU_BITS;
+int lowresCuHeight = ((cfg->param.sourceHeight/2)  + X265_LOWRES_CU_SIZE - 1) >> X265_LOWRES_CU_BITS;
+ncu = lowresCuWidth * lowresCuHeight;
 
 if (cfg->param.rc.cuTree)
 {
@@ -445,12 +447,12 @@
  * average QP of the two adjacent P-frames + an offset */
 TComSlice* prevRefSlice = curSlice->getRefPic(REF_PIC_LIST_0, 0)->getSlice();
 TComSlice* nextRefSlice = curSlice->getRefPic(REF_PIC_LIST_1, 0)->getSlice();
+double q0 = curSlice->getRefPic(REF_PIC_LIST_0, 0)->m_avgQpRc;
+double q1 = curSlice->getRefPic(REF_PIC_LIST_1, 0)->m_avgQpRc;
 bool i0 = prevRefSlice->getSliceType() == I_SLICE;
 bool i1 = nextRefSlice->getSliceType() == I_SLICE;
 int dt0 = abs(curSlice->getPOC() - prevRefSlice->getPOC());
 int dt1 = abs(curSlice->getPOC() - nextRefSlice->getPOC());
-double q0 = prevRefSlice->m_avgQpRc;
-double q1 = nextRefSlice->m_avgQpRc;
 
 // Skip taking a reference frame before the Scenecut if ABR has been reset.
 if (lastAbrResetPoc >= 0 && !isVbv)
@@ -522,7 +524,7 @@
 {
 if (!isVbv)
 {
-checkAndResetABR(rce);
+checkAndResetABR(pic, rce);
 }
 q = getQScale(rce, wantedBitsWindow / cplxrSum);
 
@@ -552,23 +554,22 @@
 if (cfg->param.rc.rateControlMode != X265_RC_CRF)
 {
 double lqmin = 0, lqmax = 0;
-if (totalBits == 0)
+if (totalBits == 0 && !isVbv)
 {
 lqmin = qp2qScale(ABR_INIT_QP_MIN) / lstep;
 lqmax = qp2qScale(ABR_INIT_QP_MAX) * lstep;
+q = Clip3(lqmin, lqmax, q);
 }
-else
+else if(totalBits > 0 || (isVbv && rce->poc > 0 ))
 {
 lqmin = lastQScaleFor[sliceType] / lstep;
 lqmax = lastQScaleFor[sliceType] * lstep;
+if (overflow > 1.1 && framesDone > 3)
+lqmax *= lstep;
+else if (overflow < 0.9)
+lqmin /= lstep;
+q = Clip3(lqmin, lqmax, q);
 }
-
-if (overflow > 1.1 && framesDone > 3)
-lqmax *= lstep;
-else if (overflow < 0.9)
-lqmin /= lstep;
-
-q = Clip3(lqmin, lqmax, q);
 }
 else
 {
@@ -593,7 +594,7 @@
 }
 }
 
-void RateControl::checkAndResetABR(RateControlEntry* rce)
+void RateControl::checkAndResetABR(TComPic* pic, RateControlEntry* rce)
 {
 double abrBuffer = 2 * cfg->param.rc.rateTolerance * bitrate;
 
@@ -604,7 +605,7 @@
 {
 // Reset ABR if prev frames are blank to prevent further sudden overflows/ high bit rate spikes.
 double underflow = 1.0 + (totalBits - wantedBitsWindow) / abrBuffer;
-if (underflow < 1 && curSlice->m_avgQpRc == 0)
+if (underflow < 1 && pic->m_avgQpRc == 0)
 {
 totalBits = 0;
 framesDone = 0;
@@ -659,9 +660,9 @@
 double bufferFillCur = bufferFill - curBits;
 double targetFill;
 double totalDuration = 0;
-frameQ[0] = sliceType == I_SLICE ? q * cfg->param.rc.ipFactor : q;
-frameQ[1] = frameQ[0] * cfg->param.rc.pbFactor;
-frameQ[2] = frameQ[0] / cfg->param.rc.ipFactor;
+frameQ[P_SLICE] = sliceType == I_SLICE ? q * cfg->param.rc.ipFactor : q;
+frameQ[B_SLICE] = frameQ[P_SLICE] * cfg->param.rc.pbFactor;
+frameQ[I_SLICE] = frameQ[P_SLICE] / cfg->param.rc.ipFactor;
 
 /* Loop over the planned future frames. */
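
(Editor's sketch, not part of the patch.) The revised ncu calculation above counts CUs on the half-resolution lookahead grid and rounds each dimension up, instead of dividing the full-resolution area by 16x16. The following is a minimal illustration under stated assumptions: X265_LOWRES_CU_SIZE is taken to be 8 and X265_LOWRES_CU_BITS to be 3 (the patch does not state their values), and lowresCuCount is a hypothetical helper name.

```cpp
#include <cassert>

// Assumed values; the patch itself does not show these definitions.
constexpr int kLowresCuSize = 8;  // stands in for X265_LOWRES_CU_SIZE
constexpr int kLowresCuBits = 3;  // stands in for X265_LOWRES_CU_BITS (log2 of the size)

// Hypothetical helper: number of CUs covering the half-resolution picture,
// rounding each dimension up to a whole CU.
int lowresCuCount(int sourceWidth, int sourceHeight)
{
    assert(sourceWidth > 0 && sourceHeight > 0);
    int cuWidth  = ((sourceWidth  / 2) + kLowresCuSize - 1) >> kLowresCuBits;
    int cuHeight = ((sourceHeight / 2) + kLowresCuSize - 1) >> kLowresCuBits;
    return cuWidth * cuHeight;
}
```

Under these assumptions a 640x272 source gives 40 x 17 = 680 CUs, the same as the old area-based expression, while 1920x1080 gives 120 x 68 = 8160 where the old expression gave 8100, because the height is not a multiple of 16.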
 

[x265] [PATCH 5 of 6] vbv: implement row wise vbvRateControl at each row diagonal

2014-02-20 Thread aarthi
# HG changeset patch
# User Aarthi Thirumalai
# Date 1392900828 -19800
#  Thu Feb 20 18:23:48 2014 +0530
# Node ID 49b90667f050a7dd9c28b5017f389f5c29a1c191
# Parent  c5c07a3ee7fcf0f331f06f83b7b3bc0b1bcc1668
vbv: implement row wise vbvRateControl at each row diagonal.

diff -r c5c07a3ee7fc -r 49b90667f050 source/encoder/ratecontrol.cpp
--- a/source/encoder/ratecontrol.cpp Thu Feb 20 18:09:53 2014 +0530
+++ b/source/encoder/ratecontrol.cpp Thu Feb 20 18:23:48 2014 +0530
@@ -221,6 +221,10 @@
 
 // validate for cfg->param.rc, maybe it is need to add a function like x265_parameters_valiate()
 cfg->param.rc.rfConstant = Clip3((double)-QP_BD_OFFSET, (double)51, cfg->param.rc.rfConstant);
+cfg->param.rc.rfConstantMax = Clip3((double)-QP_BD_OFFSET, (double)51, cfg->param.rc.rfConstantMax);
+rateFactorMaxIncrement = 0;
+vbvMinRate = 0;
+
 if (cfg->param.rc.rateControlMode == X265_RC_CRF)
 {
 cfg->param.rc.qp = (int)cfg->param.rc.rfConstant + QP_BD_OFFSET;
@@ -230,6 +234,15 @@
 double mbtree_offset = cfg->param.rc.cuTree ? (1.0 - cfg->param.rc.qCompress) * 13.5 : 0;
 rateFactorConstant = pow(baseCplx, 1 - qCompress) /
 qp2qScale(cfg->param.rc.rfConstant + mbtree_offset + QP_BD_OFFSET);
+if (cfg->param.rc.rfConstantMax)
+{
+rateFactorMaxIncrement = cfg->param.rc.rfConstantMax - cfg->param.rc.rfConstant;
+if (rateFactorMaxIncrement <= 0)
+{
+x265_log(&cfg->param, X265_LOG_WARNING, "CRF max must be greater than CRF\n");
+rateFactorMaxIncrement = 0;
+}
+}
 }
 
 isAbr = cfg->param.rc.rateControlMode != X265_RC_CQP; // later add 2pass option
@@ -760,6 +773,181 @@
 return Clip3(lmin1, lmax1, q);
 }
 
+ double RateControl::predictRowsSizeSum(TComPic* pic, double qpVbv, int32_t& encodedBitsSoFar)
+{
+uint32_t rowSatdCostSoFar = 0 ,totalSatdBits = 0;
+encodedBitsSoFar = 0;
+double qScale = qp2qScale(qpVbv);
+int sliceType = pic->getSlice()->getSliceType();
+TComPic* refPic = pic->getSlice()->getRefPic(REF_PIC_LIST_0, 0);
+int maxRows = pic->getPicSym()->getFrameHeightInCU();
+for (int row = 0 ; row < maxRows; row++)
+{
+encodedBitsSoFar += pic->m_rowEncodedBits[row];
+rowSatdCostSoFar = pic->m_rowDiagSatd[row];
+uint32_t satdCostForPendingCus = pic->m_rowSatdForVbv[row] - rowSatdCostSoFar;
+if (satdCostForPendingCus  > 0)
+{
+double pred_s = predictSize(rowPred[0], qScale, satdCostForPendingCus);
+uint32_t refRowSatdCost= 0 , refRowBits = 0;
+double refQScale=0;
+
+if (sliceType != I_SLICE)
+{
+for (uint32_t cuAddr = pic->m_numEncodedCusPerRow[row] + 1; cuAddr < pic->getPicSym()->getFrameWidthInCU() * (row + 1); cuAddr++)
+{
+refRowSatdCost += refPic->m_cuCostsForVbv[cuAddr];
+refRowBits = refPic->getCU(cuAddr)->m_totalBits;
+}
+refQScale = row == maxRows - 1 ? refPic->m_rowDiagQScale[row] : refPic->m_rowDiagQScale[row + 1];
+}
+
+if (sliceType == I_SLICE || qScale >= refQScale)
+{
+if (sliceType == P_SLICE
+&& refPic->getSlice()->getSliceType() == sliceType
+&& refQScale > 0
+&& refRowSatdCost > 0)
+{
+if (abs(int32_t(refRowSatdCost - satdCostForPendingCus)) < (int32_t)satdCostForPendingCus / 2)
+{
+double pred_t = refRowBits * satdCostForPendingCus / refRowSatdCost
+* refQScale / qScale;
+totalSatdBits += int32_t((pred_s + pred_t) * 0.5);
+}
+}
+
+else
+totalSatdBits += int32_t(pred_s);
+}
+else
+{
+ /* Our QP is lower than the reference! */
+double pred_intra = predictSize(rowPred[1], qScale, refRowSatdCost);
+/* Sum: better to overestimate than underestimate by using only one of the two predictors. */
+totalSatdBits += int32_t(pred_intra + pred_s);
+}
+}
+}
+return totalSatdBits + encodedBitsSoFar;
+ }
+
+int RateControl::rowDiagonalVbvRateControl(TComPic* pic, uint32_t row, RateControlEntry* rce, double qpVbv)
+{
+fprintf(fp,"\n poc :%d , type : %d , slice Qp : %d , row : %d , startvbv : %f ", pic->getPOC(), rce->sliceType, pic->getSlice()->getSliceQp(), row, qpVbv);
+if (rce->poc == 24)
+{
+int i=0; i++;
+}
+double qScaleVbv = qp2qScale(qpVbv);
+pic->m_rowDiagQp[row] = qpVbv;
+pic->m_rowDiagQScale[row] = qScaleVbv;
+//TODO  : check whther we gotto update prodictor using whole Row Satd or only satd of blocks upto the diagonal in row.
+
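
(Editor's sketch, not part of the patch.) In predictRowsSizeSum above, the reference-based estimate pred_t scales the reference row's bits by the ratio of pending SATD cost to reference SATD cost and by the ratio of quantizer scales, then averages it with the SATD-model estimate pred_s. A minimal standalone illustration follows; the function names are hypothetical, and doing all arithmetic in double is a choice of this sketch (the patch multiplies and divides the unsigned integer operands first).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: scale the reference row's bit count by the SATD ratio
// and the quantizer-scale ratio, mirroring the pred_t expression in the patch.
double predictFromReference(uint32_t refRowBits, uint32_t refRowSatd, double refQScale,
                            uint32_t pendingSatd, double qScale)
{
    assert(refRowSatd > 0 && qScale > 0.0);
    return double(refRowBits) * pendingSatd / refRowSatd * refQScale / qScale;
}

// Hypothetical helper: average the two estimates and truncate to whole bits,
// as the patch does with int32_t((pred_s + pred_t) * 0.5).
int32_t blendedRowBits(double predFromSatd, double predFromRef)
{
    return int32_t((predFromSatd + predFromRef) * 0.5);
}
```

For example, a reference row of 1000 bits with SATD 2000 at qscale 2.0, applied to a pending SATD of 1000 at qscale 4.0, gives a reference-based estimate of 250 bits; blending that with a SATD-model estimate of 300 gives 275.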

[x265] [PATCH 6 of 6] vbv: integrate row level vbv ratecontrol at each major row diagonal

2014-02-20 Thread aarthi
# HG changeset patch
# User Aarthi Thirumalai
# Date 1392901254 -19800
#  Thu Feb 20 18:30:54 2014 +0530
# Node ID 650d5f835e417f45bd8a9f86465ca1909eaa9526
# Parent  49b90667f050a7dd9c28b5017f389f5c29a1c191
vbv: integrate row level vbv ratecontrol at each major row diagonal.

diff -r 49b90667f050 -r 650d5f835e41 source/encoder/frameencoder.cpp
--- a/source/encoder/frameencoder.cpp   Thu Feb 20 18:23:48 2014 +0530
+++ b/source/encoder/frameencoder.cpp   Thu Feb 20 18:30:54 2014 +0530
@@ -1057,6 +1057,7 @@
 CTURow& codeRow = m_rows[m_cfg->param.bEnableWavefront ? row : 0];
 const uint32_t numCols = m_pic->getPicSym()->getFrameWidthInCU();
 const uint32_t lineStartCUAddr = row * numCols;
+double qpBase = m_pic->m_avgQpRc;
 for (uint32_t col = curRow.m_completed; col < numCols; col++)
 {
 const uint32_t cuAddr = lineStartCUAddr + col;
@@ -1067,24 +1068,42 @@
 codeRow.m_entropyCoder.resetEntropy();
 
 TEncSbac *bufSbac = (m_cfg->param.bEnableWavefront && col == 0 && row > 0) ? &m_rows[row - 1].m_bufferSbacCoder : NULL;
-if (m_cfg->param.rc.aqMode)
+
+if ((uint32_t)row >= col && (row !=0))
+qpBase = m_pic->getCU(cuAddr - numCols + 1)->m_baseQp;
+
+if (m_cfg->param.rc.aqMode || (m_cfg->param.rc.vbvBufferSize >0 && m_cfg->param.rc.vbvMaxBitrate >0))
 {
-int qp = calcQpForCu(m_pic, cuAddr);
+int qp = calcQpForCu(m_pic, cuAddr , qpBase);
 setLambda(qp, row);
-if (qp > MAX_QP)
-qp = MAX_QP;
-cu->setQP(0, (char)qp);
+qp = X265_MIN(qp, MAX_QP);
+cu->setQP(0,char(qp));
+cu->m_baseQp = qpBase;
 }
 codeRow.processCU(cu, m_pic->getSlice(), bufSbac, m_cfg->param.bEnableWavefront && col == 1);
 
-// TODO: Keep atomic running totals for rate control?
-// cu->m_totalBits;
-// cu->m_totalCost;
-// cu->m_totalDistortion;
+if (m_cfg->param.rc.vbvBufferSize && m_cfg->param.rc.vbvMaxBitrate)
+{
+// Update encoded bits, satdCost, baseQP for each CU
+m_pic->m_rowDiagSatd[row] += m_pic->m_cuCostsForVbv[cuAddr];
+m_pic->m_rowEncodedBits[row] += cu->m_totalBits;
+m_pic->m_numEncodedCusPerRow[row] = cuAddr;
+m_pic->m_qpaAq[row] += cu->getQP(0);
+m_pic->m_qpaRc[row] += cu->m_baseQp;
+
+if ((uint32_t)row == col)
+m_pic->m_rowDiagQp[row] = qpBase;
+
+// If current block is at row diagonal checkpoint, call vbv ratecontrol.
+if ((uint32_t)row == col && row != 0 )
+{
+ m_top->m_rateControl->rowDiagonalVbvRateControl(m_pic, row, &m_rce, qpBase);
+ qpBase = Clip3((double)MIN_QP, (double)MAX_MAX_QP, qpBase);
+}
+}
 
 // Completed CU processing
 m_rows[row].m_completed++;
-
 if (m_rows[row].m_completed >= 2 && row < m_numRows - 1)
 {
 ScopedLock below(m_rows[row + 1].m_lock);
@@ -1128,34 +1147,43 @@
 curRow.m_busy = false;
 }
 
-int FrameEncoder::calcQpForCu(TComPic *pic, uint32_t cuAddr)
+int FrameEncoder::calcQpForCu(TComPic *pic, uint32_t cuAddr, double baseQp)
 {
 x265_emms();
-double qp = pic->getSlice()->m_avgQpRc;
-if (m_cfg->param.rc.aqMode)
+double qp = baseQp;
+
+/* Derive qpOffet for each CU by averaging offsets for all 16x16 blocks in the cu. */
+double qp_offset = 0;
+int maxBlockCols = (pic->getPicYuvOrg()->getWidth() + (16 - 1)) / 16;
+int maxBlockRows = (pic->getPicYuvOrg()->getHeight() + (16 - 1)) / 16;
+int noOfBlocks = g_maxCUWidth / 16;
+int block_y = (cuAddr / pic->getPicSym()->getFrameWidthInCU()) * noOfBlocks;
+int block_x = (cuAddr * noOfBlocks) - block_y * pic->getPicSym()->getFrameWidthInCU();
+
+double *qpoffs = (pic->getSlice()->isReferenced() && m_cfg->param.rc.cuTree) ? pic->m_lowres.qpOffset : pic->m_lowres.qpAqOffset;
+int cnt = 0, idx =0;
+for (int h = 0; h < noOfBlocks && block_y < maxBlockRows; h++, block_y++)
 {
-/* Derive qpOffet for each CU by averaging offsets for all 16x16 blocks in the cu. */
-double qp_offset = 0;
-int maxBlockCols = (pic->getPicYuvOrg()->getWidth() + (16 - 1)) / 16;
-int maxBlockRows = (pic->getPicYuvOrg()->getHeight() + (16 - 1)) / 16;
-int noOfBlocks = g_maxCUWidth / 16;
-int block_y = (cuAddr / pic->getPicSym()->getFrameWidthInCU()) * noOfBlocks;
-int block_x = (cuAddr * noOfBlocks) - block_y * pic->getPicSym()->getFrameWidthInCU();
+for (int w = 0; w < noOfBlocks && (block_x + w) < maxBlockCols; w++)
+{
+idx = block_x + w + (block_y * maxBlockCols);
+if (m_cfg->param.rc.aqMode)
+qp_offset += qpoffs[idx];
 
-double *qpoffs = (pic->getSlice()->isReferenced() && m_cfg->param.rc.cuTree) ? pic->m_lowres.qpOffset : pic->m_lowres.qpAqOffset;
-int cnt = 0;
-for (int h = 0; h < noOfBlocks && 
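
(Editor's sketch, not part of the patch.) The frame encoder hunk above relies on raster-scan CU addressing: the CU above and to the right of address cuAddr is at cuAddr - numCols + 1, and the VBV row checkpoint fires where the row index equals the column index, except in row 0. A minimal illustration with hypothetical helper names; the bounds assertions are an addition of this sketch and do not appear in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: raster-scan address of the CU at (row, col).
uint32_t cuAddress(uint32_t row, uint32_t col, uint32_t numCols)
{
    assert(col < numCols);
    return row * numCols + col;
}

// Hypothetical helper: address of the above-right neighbour,
// equal to cuAddr - numCols + 1 as used in the patch.
uint32_t aboveRightAddress(uint32_t row, uint32_t col, uint32_t numCols)
{
    assert(row > 0 && col + 1 < numCols);
    return cuAddress(row, col, numCols) - numCols + 1;
}

// Hypothetical helper: true at the row-diagonal checkpoint (row == col, row != 0).
bool isDiagonalCheckpoint(uint32_t row, uint32_t col)
{
    return row == col && row != 0;
}
```

With 10 CUs per row, the CU at row 2, column 2 has address 22 and its above-right neighbour has address 13, which is row 1, column 3.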

Re: [x265] [PATCH] asm: modified the range of scale value in dequant

2014-02-20 Thread chen
right now

At 2014-02-20 14:12:47,muru...@multicorewareinc.com wrote:
# HG changeset patch
# User Murugan Vairavel muru...@multicorewareinc.com
# Date 1392876751 -19800
#  Thu Feb 20 11:42:31 2014 +0530
# Node ID 96e64ac56117b13b1f1ff098e1c3e6f28b3bf3f4
# Parent  3389061b75a486e004409ab628c46fed39d03b72
asm: modified the range of scale value in dequant

diff -r 3389061b75a4 -r 96e64ac56117 source/common/x86/pixel-util8.asm
--- a/source/common/x86/pixel-util8.asm Wed Feb 19 17:03:21 2014 -0600
+++ b/source/common/x86/pixel-util8.asm Thu Feb 20 11:42:31 2014 +0530
@@ -1153,8 +1153,9 @@
 INIT_XMM sse4
 cglobal dequant_normal, 4,5,5
 movd    m1, r3 ; m1 = word [scale]
-cmp r3d, 255
+cmp r3d, 32767
 jle .skip
+
 psrld   m1, 2
 mov r4d, r4m
 movd    m0, r4d ; m0 = shift


Re: [x265] [PATCH] encoder: enable VUI; set HRD parameters in SPS

2014-02-20 Thread dave

# HG changeset patch
# User Deepthi Nandakumar deep...@multicorewareinc.com
# Date 1392883371 -19800
# Node ID 3934859d310bcc3f54ad1855dd94bd71eb0e7457
# Parent  3389061b75a486e004409ab628c46fed39d03b72
encoder: enable VUI; set HRD parameters in SPS.
You can now add a VUI on the CLI.  Use --vui to get a VUI with all
default values, or use any VUI-specific option, including --nal-hrd, to
generate a VUI with an HRD, though currently the HRD will only have
default values.

This patch enables pictureTimingSEI, but enabling 
decodingUnitInfoSEI/bufferingPeriodSEI can affect
this flow. Any further info/suggestions welcomed.

diff -r 3389061b75a4 -r 3934859d310b source/encoder/encoder.cpp
--- a/source/encoder/encoder.cpp Wed Feb 19 17:03:21 2014 -0600
+++ b/source/encoder/encoder.cpp Thu Feb 20 13:32:51 2014 +0530
@@ -1459,13 +1459,13 @@
  m_bUseASR = false; // adapt search range based on temporal distances
  m_recoveryPointSEIEnabled = 0;
  m_bufferingPeriodSEIEnabled = 0;
-m_pictureTimingSEIEnabled = 0;
+m_pictureTimingSEIEnabled = 1;
  m_displayOrientationSEIAngle = 0;
  m_gradualDecodingRefreshInfoEnabled = 0;
  m_decodingUnitInfoSEIEnabled = 0;
  m_useScalingListId = 0;
  m_activeParameterSetsSEIEnabled = 0;
-m_vuiParametersPresentFlag = false;
+m_vuiParametersPresentFlag = true;
  m_minSpatialSegmentationIdc = 0;
  m_aspectRatioIdc = 0;
  m_sarWidth = 0;
diff -r 3389061b75a4 -r 3934859d310b source/encoder/frameencoder.cpp
--- a/source/encoder/frameencoder.cpp   Wed Feb 19 17:03:21 2014 -0600
+++ b/source/encoder/frameencoder.cpp   Thu Feb 20 13:32:51 2014 +0530
@@ -138,7 +138,7 @@
  m_sps.setNumLongTermRefPicSPS(0);
  if (m_cfg->getPictureTimingSEIEnabled() || m_cfg->getDecodingUnitInfoSEIEnabled())
  {
-m_sps.setHrdParameters(m_cfg->param.fpsNum, m_cfg->param.fpsDenom, 0, m_cfg->param.rc.bitrate, m_cfg->param.bframes > 0);
+m_sps.setHrdParameters(m_cfg->param.fpsNum, m_cfg->param.fpsDenom, 1, m_cfg->param.rc.bitrate, m_cfg->param.bframes > 0);
  }
  if (m_cfg->getBufferingPeriodSEIEnabled() || m_cfg->getPictureTimingSEIEnabled() || m_cfg->getDecodingUnitInfoSEIEnabled())
  {


[x265] [PATCH] user: David T Yuen dtyx...@gmail.com

2014-02-20 Thread dtyx265
# HG changeset patch
# User David T Yuen dtyx...@gmail.com
# Date 1392928085 28800
# Node ID 648dcea58040691e9ca56dd71e33b619e8769fa8
# Parent  db784e7cf7d8b58723dd9e5d6bea8c46c5102a15
user: David T Yuen dtyx...@gmail.com
branch 'default'
changed source/common/common.cpp

Added parameter checking for VUI parameters

diff -r db784e7cf7d8 -r 648dcea58040 source/common/common.cpp
--- a/source/common/common.cpp  Thu Feb 20 08:24:11 2014 -0800
+++ b/source/common/common.cpp  Thu Feb 20 12:28:05 2014 -0800
@@ -549,6 +549,29 @@
 }
 
 CHECK(param->bEnableWavefront < 0, "WaveFrontSynchro cannot be negative");
+CHECK((param->aspectRatioIdc < 0 || param->aspectRatioIdc > 16)
+  && param->aspectRatioIdc != 255, "Sample Aspect Ratio must be 0-16 or 255");
+CHECK(param->sarWidth < 0, "Sample Aspect Ratio width must be greater than 0");
+CHECK(param->sarHeight < 0, "Sample Aspect Ratio height must be greater than 0");
+CHECK(param->videoFormat < 0 || param->videoFormat > 5,
+ "Video Format must be Component component, pal, ntsc, secam, mac or undef");
+CHECK(param->colorPrimaries < 0 || param->colorPrimaries > 9
+  || param->colorPrimaries == 3, "Color Primaries must be undef, bt709, bt470m, bt470bg, smpte170m, smpte240m, film or bt2020");
+CHECK(param->transferCharacteristics < 0 || param->transferCharacteristics > 15
+  || param->transferCharacteristics == 3, "Transfer Characteristics must be undef, bt709, bt470m, bt470bg, smpte170m, smpte240m, linear, log100, log316, iec61966-2-4, bt1361e, iec61966-2-1, bt2020-10 or bt2020-12");
+CHECK(param->matrixCoeffs < 0 || param->matrixCoeffs > 10 || param->matrixCoeffs == 3,
+  "Matrix Coefficients must be undef, bt709, fcc, bt470bg, smpte170m, smpte240m, GBR, YCgCo, bt2020nc or bt2020c");
+CHECK(param->chromaSampleLocTypeTopField < 0 || param->chromaSampleLocTypeTopField > 5,
+  "Chroma Sample Location Type Top Field must be 0-5");
+CHECK(param->chromaSampleLocTypeBottomField < 0 || param->chromaSampleLocTypeBottomField > 5,
+  "Chroma Sample Location Type Bottom Field must be 0-5");
+CHECK(param->defDispWinLeftOffset < 0, "Default Display Window Left Offset must be 0 or greater");
+CHECK(param->defDispWinRightOffset < 0, "Default Display Window Right Offset must be 0 or greater");
+CHECK(param->defDispWinTopOffset < 0, "Default Display Window Top Offset must be 0 or greater");
+CHECK(param->defDispWinBottomOffset < 0, "Default Display Window Bottom Offset must be 0 or greater");
 return check_failed;
 }
 
@@ -849,6 +872,8 @@
 p->videoFormat = 4;
 else if (!strcmp(value, "undef"))
 p->videoFormat = 5;
+else
+p->videoFormat = -1;
 }
 OPT("range")
 {
@@ -877,6 +902,8 @@
 p->colorPrimaries = 8;
 else if (!strcmp(value, "bt2020"))
 p->colorPrimaries = 9;
+else
+p->colorPrimaries = -1;
 }
 OPT("transfer")
 {
@@ -911,6 +938,8 @@
 p->transferCharacteristics = 14;
 else if (!strcmp(value, "bt2020-12"))
 p->transferCharacteristics = 15;
+else
+p->transferCharacteristics = -1;
 }
 OPT("colormatrix")
 {
@@ -937,6 +966,8 @@
 p->matrixCoeffs = 9;
 else if (!strcmp(value, "bt2020c"))
 p->matrixCoeffs = 10;
+else
+p->matrixCoeffs = -1;
 }
 OPT("chromaloc")
 {
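
(Editor's sketch, not part of the patch.) The patch above pairs two steps: the option parser stores -1 when a name is not recognised, and the parameter check then rejects any value outside the valid range. The following minimal illustration covers the video format option only. The function names are hypothetical, and the name-to-index order is taken from the error message in the patch (component, pal, ntsc, secam, mac, undef as 0 to 5).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: map a video format name to its index,
// or return -1 for a name that is not recognised.
int parseVideoFormat(const char* value)
{
    assert(value != nullptr);
    static const char* const names[] = { "component", "pal", "ntsc", "secam", "mac", "undef" };
    for (int i = 0; i < 6; i++)
    {
        if (!std::strcmp(value, names[i]))
            return i;
    }
    return -1; // sentinel picked up by the range check below
}

// Hypothetical helper: true when the value fails validation,
// mirroring the condition passed to CHECK in the patch.
bool videoFormatInvalid(int videoFormat)
{
    return videoFormat < 0 || videoFormat > 5;
}
```

Because the sentinel lies outside the valid range, a misspelled name is reported by the same check that catches out-of-range numeric values, so no separate error path is needed in the parser.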


Re: [x265] [PATCH] Added command line options to generate a VUI and add it to the coded bitstream

2014-02-20 Thread Derek Buitenhuis
On 2/19/2014 6:01 PM, dtyx...@gmail.com wrote:
> +bool getTilesFixedStructureFlag() { return m_tilesFixedStructureFlag; }
> +
> +void setTilesFixedStructureFlag(bool i) { m_tilesFixedStructureFlag = i; }

I know this patch is pushed, but I feel it's a good time to say:

This is stupid. This isn't early 2000s Java.

Just make them public members -- this is needless java-style boilerplate cruft.

- Derek


[x265] [PATCH] Added better vui parameter checking

2014-02-20 Thread dtyx265
# HG changeset patch
# User David T Yuen dtyx...@gmail.com
# Date 1392935264 28800
# Node ID fe18d3e9e8ccca9ba618e1292e1a4e415e0d3547
# Parent  549f5bf10211da2b5b427d4ea0a87eb7ef20341b
Added better vui parameter checking

diff -r 549f5bf10211 -r fe18d3e9e8cc source/common/common.cpp
--- a/source/common/common.cpp  Thu Feb 20 14:47:41 2014 -0600
+++ b/source/common/common.cpp  Thu Feb 20 14:27:44 2014 -0800
@@ -549,6 +549,49 @@
 }
 
 CHECK(param->bEnableWavefront < 0, "WaveFrontSynchro cannot be negative");
+CHECK((param->aspectRatioIdc < 0
+  || param->aspectRatioIdc > 16)
+  && param->aspectRatioIdc != 255,
+ "Sample Aspect Ratio must be 0-16 or 255");
+CHECK(param->sarWidth < 0,
+ "Sample Aspect Ratio width must be greater than 0");
+CHECK(param->sarHeight < 0,
+ "Sample Aspect Ratio height must be greater than 0");
+CHECK(param->bEnableOverscanInfoPresentFlag < 0,
+ "Overscan must be show, crop or undef");
+CHECK(param->videoFormat < 0 || param->videoFormat > 5,
+ "Video Format must be Component component,"
+ " pal, ntsc, secam, mac or undef");
+CHECK(param->colorPrimaries < 0
+  || param->colorPrimaries > 9
+  || param->colorPrimaries == 3,
+ "Color Primaries must be undef, bt709, bt470m,"
+ " bt470bg, smpte170m, smpte240m, film or bt2020");
+CHECK(param->transferCharacteristics < 0
+  || param->transferCharacteristics > 15
+  || param->transferCharacteristics == 3,
+ "Transfer Characteristics must be undef, bt709, bt470m, bt470bg,"
+ " smpte170m, smpte240m, linear, log100, log316, iec61966-2-4, bt1361e,"
+ " iec61966-2-1, bt2020-10 or bt2020-12");
+CHECK(param->matrixCoeffs < 0
+  || param->matrixCoeffs > 10
+  || param->matrixCoeffs == 3,
+  "Matrix Coefficients must be undef, bt709, fcc, bt470bg, smpte170m,"
+  " smpte240m, GBR, YCgCo, bt2020nc or bt2020c");
+CHECK(param->chromaSampleLocTypeTopField < 0
+  || param->chromaSampleLocTypeTopField > 5,
+  "Chroma Sample Location Type Top Field must be 0-5");
+CHECK(param->chromaSampleLocTypeBottomField < 0
+  || param->chromaSampleLocTypeBottomField > 5,
+  "Chroma Sample Location Type Bottom Field must be 0-5");
+CHECK(param->defDispWinLeftOffset < 0,
+  "Default Display Window Left Offset must be 0 or greater");
+CHECK(param->defDispWinRightOffset < 0,
+  "Default Display Window Right Offset must be 0 or greater");
+CHECK(param->defDispWinTopOffset < 0,
+  "Default Display Window Top Offset must be 0 or greater");
+CHECK(param->defDispWinBottomOffset < 0,
+  "Default Display Window Bottom Offset must be 0 or greater");
 return check_failed;
 }
 
@@ -832,6 +875,8 @@
 p->bEnableOverscanInfoPresentFlag = bvalue;
 p->bEnableOverscanAppropriateFlag = bvalue;
 }
+else
+p->bEnableOverscanInfoPresentFlag = -1;
 }
 OPT("videoformat")
 {
@@ -849,6 +894,8 @@
 p->videoFormat = 4;
 else if (!strcmp(value, "undef"))
 p->videoFormat = 5;
+else
+p->videoFormat = -1;
 }
 OPT("range")
 {
@@ -877,6 +924,8 @@
 p->colorPrimaries = 8;
 else if (!strcmp(value, "bt2020"))
 p->colorPrimaries = 9;
+else
+p->colorPrimaries = -1;
 }
 OPT("transfer")
 {
@@ -911,6 +960,8 @@
 p->transferCharacteristics = 14;
 else if (!strcmp(value, "bt2020-12"))
 p->transferCharacteristics = 15;
+else
+p->transferCharacteristics = -1;
 }
 OPT("colormatrix")
 {
@@ -937,6 +988,8 @@
 p->matrixCoeffs = 9;
 else if (!strcmp(value, "bt2020c"))
 p->matrixCoeffs = 10;
+else
+p->matrixCoeffs = -1;
 }
 OPT("chromaloc")
 {


Re: [x265] [PATCH] encoder: enable VUI; set HRD parameters in SPS

2014-02-20 Thread Deepthi Nandakumar
This patch has been superseded by the one that enables it from the CLI. I
removed this one from patch list on patchworks.


On Fri, Feb 21, 2014 at 1:14 AM, Steve Borho st...@borho.org wrote:




 On Thu, Feb 20, 2014 at 12:17 PM, dave dtyx...@gmail.com wrote:

 # HG changeset patch
 # User Deepthi Nandakumar deep...@multicorewareinc.com
 # Date 1392883371 -19800
 # Node ID 3934859d310bcc3f54ad1855dd94bd71eb0e7457
 # Parent  3389061b75a486e004409ab628c46fed39d03b72
 encoder: enable VUI; set HRD parameters in SPS.

 You can now add a VUI on the CLI.  Use --vui to get a VUI with all
 default values, or use any VUI-specific option, including --nal-hrd, to
 generate a VUI with an HRD, though currently the HRD will only have
 default values.


 agreed, we should try to follow x264's CLI and defaults as much as
 possible for new features that come online.

 --
 Steve Borho





Re: [x265] (no subject)

2014-02-20 Thread chen
At 2014-02-21 10:13:56,Satoshi Nakagawa nakagawa...@oki.com wrote:
# HG changeset patch
# User Satoshi Nakagawa nakagawa...@oki.com
# Date 1392948676 -32400
#  Fri Feb 21 11:11:16 2014 +0900
# Node ID 66d8cb6573f27b29a9dc92ec480c635f0de48c03
# Parent  894bde574bc1678471e0c23ceb381a806768ea95
asm: update count_nonzero, add testbench

diff -r 894bde574bc1 -r 66d8cb6573f2 source/common/x86/pixel-util8.asm
--- a/source/common/x86/pixel-util8.asm Thu Feb 20 17:18:42 2014 -0600
+++ b/source/common/x86/pixel-util8.asm Fri Feb 21 11:11:16 2014 +0900
@@ -1240,11 +1240,12 @@
 ; int count_nonzero(const int32_t *quantCoeff, int numCoeff);
 ;-
 INIT_XMM sse2
-cglobal count_nonzero, 2,3,4
+cglobal count_nonzero, 2,2,4
 pxor        m0, m0
-pxor        m1, m1
-mov         r2d, r1d
 shr         r1d, 3
+movd        m1, r1d
+pshufd      m1, m1, 0
+packssdw    m1, m1
packssdw is an expensive instruction; pshuflw+punpcklqdq is better.
 
 .loop
 mova        m2, [r0]
@@ -1252,16 +1253,13 @@
 add         r0, 32
 packssdw    m2, m3
 pcmpeqw     m2, m0
-psrlw       m2, 15
-packsswb    m2, m2
-psadbw      m2, m0
-paddd       m1, m2
+paddw       m1, m2
 dec         r1d
-jnz         .loop
-
-movd        r1d, m1
-sub         r2d, r1d
-mov         eax, r2d
+jnz         .loop
+
+packuswb    m1, m1
+psadbw      m1, m0
+movd        eax, m1
 
 RET


Re: [x265] [PATCH] Added command line options to generate a VUI and add it to the coded bitstream

2014-02-20 Thread Steve Borho
On Thu, Feb 20, 2014 at 5:31 PM, dave dtyx...@gmail.com wrote:
<snip>
> I have already found a minor nit here and there so you will probably see me
> submit a relevant tweak here and there.

Ok

> There looks to be a lot of cruft in the design of x265, mostly from the HM
> base. If you want to get rid of it then there needs to be a plan for coming up
> with something to replace it, and I am not a fan of simply replacing
> x265's C++ classes with C structures that are copies of those classes.  If
> this is something that you need somebody to work on then I would like to
> volunteer.

Indeed. No one else has started this work; and it is definitely in
need of rewrites in many places.  If you enjoy that sort of thing then
by all means please do.

--
Steve


[x265] asm: update count_nonzero, add testbench

2014-02-20 Thread Satoshi Nakagawa

+pshufd      m1, m1, 0
+packssdw    m1, m1
 packssdw is an expensive instruction; pshuflw+punpcklqdq is better.

revised, thanks.


# HG changeset patch
# User Satoshi Nakagawa nakagawa...@oki.com
# Date 1392953002 -32400
#  Fri Feb 21 12:23:22 2014 +0900
# Node ID e4a80e46bd80e7d516dc881da7f38737c0071ccf
# Parent  894bde574bc1678471e0c23ceb381a806768ea95
asm: update count_nonzero, add testbench

diff -r 894bde574bc1 -r e4a80e46bd80 source/common/x86/pixel-util8.asm
--- a/source/common/x86/pixel-util8.asm Thu Feb 20 17:18:42 2014 -0600
+++ b/source/common/x86/pixel-util8.asm Fri Feb 21 12:23:22 2014 +0900
@@ -1240,11 +1240,12 @@
 ; int count_nonzero(const int32_t *quantCoeff, int numCoeff);
 ;-
 INIT_XMM sse2
-cglobal count_nonzero, 2,3,4
+cglobal count_nonzero, 2,2,4
     pxor        m0, m0
-    pxor        m1, m1
-    mov         r2d, r1d
     shr         r1d, 3
+    movd        m1, r1d
+    pshuflw     m1, m1, 0
+    punpcklqdq  m1, m1
 
 .loop
     mova        m2, [r0]
@@ -1252,16 +1253,13 @@
     add         r0, 32
     packssdw    m2, m3
     pcmpeqw     m2, m0
-    psrlw       m2, 15
-    packsswb    m2, m2
-    psadbw      m2, m0
-    paddd       m1, m2
+    paddw       m1, m2
     dec         r1d
-    jnz        .loop
-
-    movd        r1d, m1
-    sub         r2d, r1d
-    mov         eax, r2d
+    jnz         .loop
+
+    packuswb    m1, m1
+    psadbw      m1, m0
+    movd        eax, m1
 
     RET
 
diff -r 894bde574bc1 -r e4a80e46bd80 source/test/mbdstharness.cpp
--- a/source/test/mbdstharness.cpp  Thu Feb 20 17:18:42 2014 -0600
+++ b/source/test/mbdstharness.cpp  Fri Feb 21 12:23:22 2014 +0900
@@ -380,6 +380,41 @@
     return true;
 }
 
+bool MBDstHarness::check_count_nonzero_primitive(count_nonzero_t ref, count_nonzero_t opt)
+{
+    ALIGN_VAR_32(int32_t, qcoeff[32 * 32]);
+
+    for (int i = 0; i < 4; i++)
+    {
+        int log2TrSize = i + 2;
+        int num = 1 << (log2TrSize * 2);
+        int mask = num - 1;
+
+        for (int n = 0; n <= num; n++)
+        {
+            memset(qcoeff, 0, num * sizeof(int32_t));
+
+            for (int j = 0; j < n; j++)
+            {
+                int k = rand() & mask;
+                while (qcoeff[k])
+                {
+                    k = (k + 11) & mask;
+                }
+                qcoeff[k] = rand() - RAND_MAX / 2;
+            }
+
+            int refval = ref(qcoeff, num);
+            int optval = opt(qcoeff, num);
+
+            if (refval != optval)
+                return false;
+        }
+    }
+
+    return true;
+}
+
 bool MBDstHarness::testCorrectness(const EncoderPrimitives& ref, const EncoderPrimitives& opt)
 {
     for (int i = 0; i < NUM_DCTS; i++)
@@ -424,6 +459,15 @@
         }
     }
 
+    if (opt.count_nonzero)
+    {
+        if (!check_count_nonzero_primitive(ref.count_nonzero, opt.count_nonzero))
+        {
+            printf("count_nonzero: Failed!\n");
+            return false;
+        }
+    }
+
     return true;
 }
 
@@ -465,4 +509,13 @@
         int dummy = -1;
         REPORT_SPEEDUP(opt.quant, ref.quant, mintbuf1, mintbuf2, mintbuf3, mintbuf4, 23, 23785, 32 * 32, &dummy);
     }
+
+    if (opt.count_nonzero)
+    {
+        for (int i = 4; i <= 32; i <<= 1)
+        {
+            printf("count_nonzero[%dx%d]", i, i);
+            REPORT_SPEEDUP(opt.count_nonzero, ref.count_nonzero, mbufidct, i * i)
+        }
+    }
 }
diff -r 894bde574bc1 -r e4a80e46bd80 source/test/mbdstharness.h
--- a/source/test/mbdstharness.hThu Feb 20 17:18:42 2014 -0600
+++ b/source/test/mbdstharness.hFri Feb 21 12:23:22 2014 +0900
@@ -43,6 +43,7 @@
     bool check_quant_primitive(quant_t ref, quant_t opt);
     bool check_dct_primitive(dct_t ref, dct_t opt, int width);
     bool check_idct_primitive(idct_t ref, idct_t opt, int width);
+    bool check_count_nonzero_primitive(count_nonzero_t ref, count_nonzero_t opt);
 
 public:
 
___
x265-devel mailing list
x265-devel@videolan.org
https://mailman.videolan.org/listinfo/x265-devel
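
For readers following the SSE2 loop in the patch above, a scalar model may help. The sketch below is not part of the patch; the names countNonzeroLaneModel and countNonzeroRef are hypothetical. It assumes numCoeff is a positive multiple of 8, as the asm loop does, and mimics how each of the eight 16-bit lanes starts at the iteration count and drops by one (the pcmpeqw mask 0xFFFF) for every zero coefficient it sees.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of the SSE2 loop: eight 16-bit lanes, each seeded with the
// iteration count (numCoeff / 8). A zero coefficient adds 0xFFFF (that is, -1),
// so each lane ends at "iterations - zeros in lane", which is the number of
// nonzero coefficients that lane saw.
int countNonzeroLaneModel(const int32_t* quantCoeff, int numCoeff)
{
    assert(numCoeff > 0 && numCoeff % 8 == 0);
    const int iterations = numCoeff >> 3;

    uint16_t lane[8];
    for (int i = 0; i < 8; i++)
        lane[i] = static_cast<uint16_t>(iterations); // movd + pshuflw + punpcklqdq

    for (int it = 0; it < iterations; it++)
    {
        for (int i = 0; i < 8; i++)
        {
            // packssdw saturates to int16 but never turns a nonzero value into 0
            const bool isZero = quantCoeff[it * 8 + i] == 0;
            lane[i] = static_cast<uint16_t>(lane[i] + (isZero ? 0xFFFF : 0)); // pcmpeqw + paddw
        }
    }

    int total = 0;
    for (int i = 0; i < 8; i++)
        total += lane[i]; // packuswb + psadbw
    return total;
}

// Straightforward reference for comparison.
int countNonzeroRef(const int32_t* quantCoeff, int numCoeff)
{
    int count = 0;
    for (int i = 0; i < numCoeff; i++)
        count += quantCoeff[i] != 0;
    return count;
}
```

With at most 32x32 = 1024 coefficients, each lane holds at most 128, so the final packuswb to unsigned bytes should not saturate before psadbw sums them.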


Re: [x265] APPCRASH in x265 0.7+207 while encoding in preset 'slow' or slower...

2014-02-20 Thread Steve Borho
On Wed, Feb 19, 2014 at 1:03 AM, Mario *LigH* Rohkrämer cont...@ligh.de wrote:
 On 18.02.2014 at 18:34, Steve Borho st...@borho.org wrote:


 On Tue, Feb 18, 2014 at 8:33 AM, JMK three4tee...@coldmail.nu wrote:

 @ the MCW x265 team:

 wouldn't it be time to set up a test bot or something?


 we do; and we've found that the 4:4:4 changes caused problems for 10bit
 builds that we're looking through.

 Are you using a HIGH_BIT_DEPTH build, Mario?


 No, a plain 8bpp 64bit Windows build (GCC 4.8.2 cross-compile).

 Rumours say that GCC can be over-aggressive in some optimizations.

Hi Mario,

There were a couple of bugs that were causing problems even at 8bpp.
I believe they are all resolved on the default branch tip now.  There
haven't been any GCC specific bugs since the very early days of the
project (if you don't count build breakages, which I hope are becoming
rare).

-- 
Steve Borho


[x265] clear m_tempPel only once

2014-02-20 Thread Satoshi Nakagawa
# HG changeset patch
# User Satoshi Nakagawa nakagawa...@oki.com
# Date 1392962042 -32400
#  Fri Feb 21 14:54:02 2014 +0900
# Node ID 3706b57addade77ee6d06bd95600c99e80abb93a
# Parent  0c19c44af2d3a8825d804597f1c2f82e32e4d4b7
clear m_tempPel only once

diff -r 0c19c44af2d3 -r 3706b57addad source/Lib/TLibEncoder/TEncSearch.cpp
--- a/source/Lib/TLibEncoder/TEncSearch.cpp Fri Feb 21 12:23:22 2014 +0900
+++ b/source/Lib/TLibEncoder/TEncSearch.cpp Fri Feb 21 14:54:02 2014 +0900
@@ -79,7 +79,7 @@
 
 TEncSearch::~TEncSearch()
 {
-    delete [] m_tempPel;
+    X265_FREE(m_tempPel);
 
     if (m_cfg)
     {
@@ -135,7 +135,8 @@
 
     initTempBuff(cfg->param.internalCsp);
 
-    m_tempPel = new Pel[g_maxCUWidth * g_maxCUHeight];
+    m_tempPel = X265_MALLOC(Pel, g_maxCUWidth * g_maxCUHeight);
+    memset(m_tempPel, 0, sizeof(Pel) * g_maxCUWidth * g_maxCUHeight);
 
     const uint32_t numLayersToAllocate = cfg->getQuadtreeTULog2MaxSize() - cfg->getQuadtreeTULog2MinSize() + 1;
     m_qtTempCoeffY  = new TCoeff*[numLayersToAllocate];
@@ -3564,8 +3565,6 @@
     const uint32_t numSamplesLuma = 1 << (trSizeLog2 << 1);
     const uint32_t numSamplesChroma = 1 << (trSizeCLog2 << 1);
 
-    ::memset(m_tempPel, 0, sizeof(Pel) * numSamplesLuma); // not necessary needed for inside of recursion (only at the beginning)
-
     int partSize = partitionFromSizes(trWidth, trHeight);
     uint32_t distY = primitives.sse_sp[partSize](resiYuv->getLumaAddr(absTUPartIdx), resiYuv->m_width, m_tempPel, trWidth);
 


[x265] [PATCH 2 of 6] vbv: enable vbvLookahead for Keyframes; accumulate frame rowSatds from lowres rowSatds

2014-02-20 Thread aarthi
# HG changeset patch
# User Aarthi Thirumalai
# Date 1392896018 -19800
#  Thu Feb 20 17:03:38 2014 +0530
# Node ID cbfb5bc44d6b8b74dc299a1c31bcfaa71dde2865
# Parent  d90c9b27f413a4b00079a31ca7b0411a5fb8eb19
vbv: enable vbvLookahead for Keyframes; accumulate frame rowSatds from lowres 
rowSatds.

diff -r d90c9b27f413 -r cbfb5bc44d6b source/encoder/slicetype.cpp
--- a/source/encoder/slicetype.cpp  Fri Feb 21 10:29:03 2014 +0530
+++ b/source/encoder/slicetype.cpp  Thu Feb 20 17:03:38 2014 +0530
@@ -171,22 +171,48 @@
         cost.estimateFrameCost(frames, p0, p1, b, false);
         cost.flush();
     }
+    if (cfg->param.rc.cuTree)
+    {
+        pic->m_lowres.satdCost = frameCostRecalculate(frames, p0, p1, b);
+        if (b && cfg->param.rc.vbvBufferSize)
+            frameCostRecalculate(frames,b, b, b);
+    }
 
-    if (cfg->param.rc.cuTree)
-        pic->m_lowres.satdCost = frameCostRecalculate(frames, p0, p1, b);
     else if (cfg->param.rc.aqMode)
         pic->m_lowres.satdCost = pic->m_lowres.costEstAq[b - p0][p1 - b];
     else
         pic->m_lowres.satdCost = pic->m_lowres.costEst[b - p0][p1 - b];
+
+    if (cfg->param.rc.vbvBufferSize > 0 && cfg->param.rc.vbvMaxBitrate > 0)
+    {
+        pic->m_lowres.lowresCostForRc = pic->m_lowres.lowresCosts[b - p0][p1 - b];
+        uint32_t lowresRow = 0 , lowresCol = 0, lowresCuIdx = 0, sum = 0;
+        uint32_t scale = cfg->param.maxCUSize / (2 * X265_LOWRES_CU_SIZE);
+        uint32_t widthInLowresCu = (uint32_t)widthInCU, heightInLowresCu = (uint32_t)heightInCU;
+
+        for (uint32_t row = 0; row < pic->getFrameHeightInCU(); row++)
+        {
+            lowresRow = row * scale;
+            for (uint32_t cnt = 0 ; cnt < scale && lowresRow < heightInLowresCu; lowresRow++, cnt++)
+            {
+                sum = 0;
+                lowresCuIdx = lowresRow * widthInLowresCu ;
+                for (lowresCol = 0; lowresCol < widthInLowresCu; lowresCol++, lowresCuIdx++)
+                {
+                    sum +=  pic->m_lowres.lowresCostForRc[lowresCuIdx];
+                }
+                pic->m_rowSatdForVbv[row] += sum;
+            }
+        }
+    }
     return pic->m_lowres.satdCost;
 }
-
 void Lookahead::slicetypeDecide()
 {
 Lowres *frames[X265_LOOKAHEAD_MAX];
 TComPic *list[X265_LOOKAHEAD_MAX];
 TComPic *ipic = inputQueue.first();
-
+    bool isKeyFrameAnalyse = (cfg->param.rc.cuTree || (cfg->param.rc.vbvBufferSize && cfg->param.lookaheadDepth));
     if (!est.rows && ipic)
         est.init(cfg, ipic);
 
@@ -371,8 +397,11 @@
 outputQueue.pushBack(*list[i]);
 }
 }
+    if (isKeyFrameAnalyse && IS_X265_TYPE_I(lastNonB->sliceType))
+    {
+        slicetypeAnalyse(frames,true);
+    }
 }
-
 void Lookahead::vbvLookahead(Lowres **frames, int numFrames, int keyframe)
 {
 int prevNonB = 0, curNonB = 1, idx = 0;
___
x265-devel mailing list
x265-devel@videolan.org
https://mailman.videolan.org/listinfo/x265-devel


[x265] [PATCH 1 of 6] vbv: Introduce states to hold row data for row level VBV ratecontrol

2014-02-20 Thread aarthi
# HG changeset patch
# User Aarthi Thirumalai
# Date 1392958743 -19800
#  Fri Feb 21 10:29:03 2014 +0530
# Node ID d90c9b27f413a4b00079a31ca7b0411a5fb8eb19
# Parent  0c19c44af2d3a8825d804597f1c2f82e32e4d4b7
vbv: Introduce states to hold row data for row level VBV ratecontrol.

diff -r 0c19c44af2d3 -r d90c9b27f413 source/Lib/TLibCommon/TComDataCU.cpp
--- a/source/Lib/TLibCommon/TComDataCU.cpp  Fri Feb 21 12:23:22 2014 +0900
+++ b/source/Lib/TLibCommon/TComDataCU.cpp  Fri Feb 21 10:29:03 2014 +0530
@@ -99,8 +99,8 @@
 m_mvpIdx[0] = NULL;
 m_mvpIdx[1] = NULL;
 m_chromaFormat = 0;
+m_baseQp = 0;
 }
-
 TComDataCU::~TComDataCU()
 {}
 
@@ -235,6 +235,7 @@
     m_totalBits = 0;
     m_numPartitions = pic->getNumPartInCU();
     int qp = pic->m_lowres.invQscaleFactor ? pic->getCU(getAddr())->getQP(0) : m_slice->getSliceQp();
+    m_baseQp = pic->getCU(getAddr())->m_baseQp;
     for (int i = 0; i < 4; i++)
     {
         m_avgCost[i] = 0;
diff -r 0c19c44af2d3 -r d90c9b27f413 source/Lib/TLibCommon/TComDataCU.h
--- a/source/Lib/TLibCommon/TComDataCU.hFri Feb 21 12:23:22 2014 +0900
+++ b/source/Lib/TLibCommon/TComDataCU.hFri Feb 21 10:29:03 2014 +0530
@@ -179,7 +179,7 @@
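     uint64_t  m_avgCost[4];  // stores the avg cost of CU's in frame for each depth
     uint32_t  m_count[4];
     uint64_t  m_sa8dCost;
-
+    double    m_baseQp;      //Qp of Cu set from RateControl/Vbv.
     // ---------------------------------------------------------------------
     // create / destroy / initialize / copy
     // ---------------------------------------------------------------------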
diff -r 0c19c44af2d3 -r d90c9b27f413 source/Lib/TLibCommon/TComPic.cpp
--- a/source/Lib/TLibCommon/TComPic.cpp Fri Feb 21 12:23:22 2014 +0900
+++ b/source/Lib/TLibCommon/TComPic.cpp Fri Feb 21 10:29:03 2014 +0530
@@ -56,6 +56,13 @@
 , m_bUsedByCurr(false)
 , m_bIsLongTerm(false)
 , m_bCheckLTMSB(false)
+, m_rowDiagQp(NULL)
+, m_rowDiagQScale(NULL)
+, m_rowDiagSatd(NULL)
+, m_rowEncodedBits(NULL)
+, m_numEncodedCusPerRow(NULL)
+, m_rowSatdForVbv(NULL)
+, m_cuCostsForVbv(NULL)
 {
 m_reconRowCount = 0;
 m_countRefEncoders = 0;
@@ -69,9 +76,11 @@
 m_ssimCnt = 0;
 m_frameTime = 0.0;
 m_elapsedCompressTime = 0.0;
+m_qpaAq = 0;
+m_qpaRc = 0;
+m_avgQpRc = 0;
 m_bChromaPlanesExtended = false;
 }
-
 TComPic::~TComPic()
 {}
 
@@ -94,9 +103,47 @@
 ok = m_origPicYuv-create(cfg-param.sourceWidth, 
cfg-param.sourceHeight, cfg-param.internalCsp, g_maxCUWidth, g_maxCUHeight, 
g_maxCUDepth);
 ok = m_reconPicYuv-create(cfg-param.sourceWidth, 
cfg-param.sourceHeight, cfg-param.internalCsp, g_maxCUWidth, g_maxCUHeight, 
g_maxCUDepth);
 ok = m_lowres.create(m_origPicYuv, cfg-param.bframes, 
cfg-param.rc.aqMode);
+
+    if (ok && cfg->param.rc.vbvBufferSize > 0 && cfg->param.rc.vbvMaxBitrate > 0)
+    {
+        int numRows = m_picSym->getFrameHeightInCU();
+        int numCols = m_picSym->getFrameWidthInCU();
+        CHECKED_MALLOC(m_rowDiagQp, double, numRows);
+        CHECKED_MALLOC(m_rowDiagQScale, double, numRows);
+        CHECKED_MALLOC(m_rowDiagSatd, uint32_t, numRows);
+        CHECKED_MALLOC(m_rowEncodedBits, uint32_t, numRows);
+        CHECKED_MALLOC(m_numEncodedCusPerRow, uint32_t, numRows);
+        CHECKED_MALLOC(m_rowSatdForVbv, uint32_t, numRows);
+        CHECKED_MALLOC(m_cuCostsForVbv, uint32_t, numRows * numCols);
+        CHECKED_MALLOC(m_qpaRc, double, numRows);
+        CHECKED_MALLOC(m_qpaAq, int, numRows);
+        reInit(cfg);
+    }
+
+    return ok;
+
+fail :
+    ok = false;
     return ok;
 }
 
+void TComPic::reInit(TEncCfg* cfg)
+{
+    if (cfg->param.rc.vbvBufferSize > 0 && cfg->param.rc.vbvMaxBitrate > 0)
+    {
+        int numRows = m_picSym->getFrameHeightInCU();
+        int numCols = m_picSym->getFrameWidthInCU();
+        memset(m_rowDiagQp, 0, numRows * sizeof(double));
+        memset(m_rowDiagQScale, 0, numRows * sizeof(double));
+        memset(m_rowDiagSatd, 0, numRows * sizeof(uint32_t));
+        memset(m_rowEncodedBits, 0, numRows * sizeof(uint32_t));
+        memset(m_numEncodedCusPerRow, 0, numRows * sizeof(uint32_t));
+        memset(m_rowSatdForVbv, 0, numRows * sizeof(uint32_t));
+        memset(m_cuCostsForVbv, 0,  numRows * numCols * sizeof(uint32_t));
+        memset(m_qpaRc, 0, numRows * sizeof(double));
+        memset(m_qpaAq, 0, numRows * sizeof(uint32_t));
+    }
+}
 void TComPic::destroy(int bframes)
 {
 if (m_picSym)
@@ -119,8 +166,16 @@
 delete m_reconPicYuv;
 m_reconPicYuv = NULL;
 }
+m_lowres.destroy(bframes);
 
-m_lowres.destroy(bframes);
+X265_FREE(m_rowDiagQp);
+X265_FREE(m_rowDiagQScale);
+X265_FREE(m_rowDiagSatd);
+X265_FREE(m_rowEncodedBits);
+X265_FREE(m_numEncodedCusPerRow);
+

[x265] [PATCH 0 of 6 ] vbv ratecontrol revised.

2014-02-20 Thread aarthi

___
x265-devel mailing list
x265-devel@videolan.org
https://mailman.videolan.org/listinfo/x265-devel


[x265] [PATCH 5 of 6] vbv: implement row wise vbvRateControl at each row diagonal

2014-02-20 Thread aarthi
# HG changeset patch
# User Aarthi Thirumalai
# Date 1392962748 -19800
#  Fri Feb 21 11:35:48 2014 +0530
# Node ID 22d4811e0676fbec5e8bf96d99be9b98020fc89f
# Parent  d058d4dd84a547b9b72a043ed065a94220b2f32b
vbv: implement row wise vbvRateControl at each row diagonal

diff -r d058d4dd84a5 -r 22d4811e0676 source/encoder/ratecontrol.cpp
--- a/source/encoder/ratecontrol.cppThu Feb 20 18:09:53 2014 +0530
+++ b/source/encoder/ratecontrol.cppFri Feb 21 11:35:48 2014 +0530
@@ -220,6 +220,10 @@
 
     // validate for cfg->param.rc, maybe it is need to add a function like x265_parameters_valiate()
     cfg->param.rc.rfConstant = Clip3((double)-QP_BD_OFFSET, (double)51, cfg->param.rc.rfConstant);
+    cfg->param.rc.rfConstantMax = Clip3((double)-QP_BD_OFFSET, (double)51, cfg->param.rc.rfConstantMax);
+    rateFactorMaxIncrement = 0;
+    vbvMinRate = 0;
+
     if (cfg->param.rc.rateControlMode == X265_RC_CRF)
     {
         cfg->param.rc.qp = (int)cfg->param.rc.rfConstant + QP_BD_OFFSET;
@@ -229,6 +233,15 @@
         double mbtree_offset = cfg->param.rc.cuTree ? (1.0 - cfg->param.rc.qCompress) * 13.5 : 0;
         rateFactorConstant = pow(baseCplx, 1 - qCompress) /
             qp2qScale(cfg->param.rc.rfConstant + mbtree_offset + QP_BD_OFFSET);
+        if (cfg->param.rc.rfConstantMax)
+        {
+            rateFactorMaxIncrement = cfg->param.rc.rfConstantMax - cfg->param.rc.rfConstant;
+            if (rateFactorMaxIncrement <= 0)
+            {
+                x265_log(&cfg->param, X265_LOG_WARNING, "CRF max must be greater than CRF\n");
+                rateFactorMaxIncrement = 0;
+            }
+        }
     }
 
     isAbr = cfg->param.rc.rateControlMode != X265_RC_CQP; // later add 2pass option
@@ -448,6 +461,7 @@
     bool i1 = nextRefSlice->getSliceType() == I_SLICE;
     int dt0 = abs(curSlice->getPOC() - prevRefSlice->getPOC());
     int dt1 = abs(curSlice->getPOC() - nextRefSlice->getPOC());
+
     // Skip taking a reference frame before the Scenecut if ABR has been reset.
     if (lastAbrResetPoc >= 0 && !isVbv)
     {
@@ -676,7 +690,7 @@
                 continue;
             }
             /* Try to get the buffer no more than 80% filled, but don't set an impossible goal. */
-            targetFill = Clip3(bufferFill - totalDuration * vbvMaxRate * 0.5, bufferSize * 0.8, bufferSize);
+            targetFill = Clip3(bufferSize * 0.8, bufferSize, bufferFill - totalDuration * vbvMaxRate * 0.5);
             if (vbvMinRate && bufferFillCur > targetFill)
             {
                 q /= 1.01;
@@ -748,6 +762,169 @@
 
 return Clip3(lmin1, lmax1, q);
 }
+ double RateControl::predictRowsSizeSum(TComPic* pic, double qpVbv, int32_t& encodedBitsSoFar)
+{
+    uint32_t rowSatdCostSoFar = 0 ,totalSatdBits = 0;
+    encodedBitsSoFar = 0;
+    double qScale = qp2qScale(qpVbv);
+    int picType = pic->getSlice()->getSliceType();
+    TComPic* refPic = pic->getSlice()->getRefPic(REF_PIC_LIST_0, 0);
+    int maxRows = pic->getPicSym()->getFrameHeightInCU();
+    for (int row = 0 ; row < maxRows; row++)
+    {
+        encodedBitsSoFar += pic->m_rowEncodedBits[row];
+        rowSatdCostSoFar = pic->m_rowDiagSatd[row];
+        uint32_t satdCostForPendingCus = pic->m_rowSatdForVbv[row] - rowSatdCostSoFar;
+        if (satdCostForPendingCus  > 0)
+        {
+            double pred_s = predictSize(rowPred[0], qScale, satdCostForPendingCus);
+            uint32_t refRowSatdCost= 0 , refRowBits = 0;
+            double refQScale=0;
+
+            if (picType != I_SLICE)
+            {
+                uint32_t endCuAddr = pic->getPicSym()->getFrameWidthInCU() * (row + 1);
+                for (uint32_t cuAddr = pic->m_numEncodedCusPerRow[row] + 1; cuAddr < endCuAddr; cuAddr++)
+                {
+                    refRowSatdCost += refPic->m_cuCostsForVbv[cuAddr];
+                    refRowBits = refPic->getCU(cuAddr)->m_totalBits;
+                }
+                refQScale = row == maxRows - 1 ? refPic->m_rowDiagQScale[row] : refPic->m_rowDiagQScale[row + 1];
+            }
+
+            if (picType == I_SLICE || qScale >= refQScale)
+            {
+                if (picType == P_SLICE
+                    && refPic->getSlice()->getSliceType() == picType
+                    && refQScale > 0
+                    && refRowSatdCost > 0)
+                {
+                    if (abs(int32_t(refRowSatdCost - satdCostForPendingCus)) < (int32_t)satdCostForPendingCus / 2)
+                    {
+                        double pred_t = refRowBits * satdCostForPendingCus / refRowSatdCost
+                            * refQScale / qScale;
+                        totalSatdBits += int32_t((pred_s + pred_t) * 0.5);
+                    }
+                }
+
+                else
+                    totalSatdBits += int32_t(pred_s);
+            }
+            else
+            {
+                /* Our QP is lower than the reference! */
+double 
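
One hunk in this patch reorders the arguments of the targetFill clamp. A minimal sketch of why the order matters, assuming Clip3 takes (min, max, value) as the corrected call suggests. The names clip3 and targetFillFixed below are hypothetical stand-ins, not x265 code:

```cpp
#include <algorithm>
#include <cassert>

// Stand-in for a (min, max, value) clamp; the argument order is an
// assumption inferred from the corrected call in the patch.
template<typename T>
T clip3(T minVal, T maxVal, T value)
{
    assert(minVal <= maxVal);
    return std::min(std::max(minVal, value), maxVal);
}

// Clamp the projected buffer fill into [80% of the buffer, full buffer].
double targetFillFixed(double bufferFill, double bufferSize,
                       double totalDuration, double vbvMaxRate)
{
    const double projected = bufferFill - totalDuration * vbvMaxRate * 0.5;
    return clip3(bufferSize * 0.8, bufferSize, projected);
}
```

Under the same assumption, the old argument order would have treated bufferSize as the value to clamp and bufferSize * 0.8 as the upper bound, so the result would always have been 80% of the buffer regardless of the projected fill.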

[x265] [PATCH 3 of 6] vbv: Add row predictors, rc states for vbv

2014-02-20 Thread aarthi
# HG changeset patch
# User Aarthi Thirumalai
# Date 1392898673 -19800
#  Thu Feb 20 17:47:53 2014 +0530
# Node ID 9517ea331f3e87f41c687db11de02cb95ded8756
# Parent  cbfb5bc44d6b8b74dc299a1c31bcfaa71dde2865
vbv: Add row predictors, rc states for vbv.

diff -r cbfb5bc44d6b -r 9517ea331f3e source/encoder/encoder.cpp
--- a/source/encoder/encoder.cppThu Feb 20 17:03:38 2014 +0530
+++ b/source/encoder/encoder.cppThu Feb 20 17:47:53 2014 +0530
@@ -218,6 +218,7 @@
         FrameEncoder *encoder = m_frameEncoder[encIdx];
         double bits;
         bits = encoder->m_rce.frameSizePlanned;
+        bits = X265_MAX(bits, m_rateControl->frameSizeEstimated);
         rc->bufferFill -= bits;
         rc->bufferFill = X265_MAX(rc->bufferFill, 0);
         rc->bufferFill += encoder->m_rce.bufferRate;
diff -r cbfb5bc44d6b -r 9517ea331f3e source/encoder/ratecontrol.cpp
--- a/source/encoder/ratecontrol.cppThu Feb 20 17:03:38 2014 +0530
+++ b/source/encoder/ratecontrol.cppThu Feb 20 17:47:53 2014 +0530
@@ -243,11 +243,12 @@
 lastNonBPictType = I_SLICE;
 isAbrReset = false;
 lastAbrResetPoc = -1;
+frameSizeEstimated = 0;
 // vbv initialization
     cfg->param.rc.vbvBufferSize = Clip3(0, 200, cfg->param.rc.vbvBufferSize);
     cfg->param.rc.vbvMaxBitrate = Clip3(0, 200, cfg->param.rc.vbvMaxBitrate);
     cfg->param.rc.vbvBufferInit = Clip3(0.0, 200.0, cfg->param.rc.vbvBufferInit);
-
+    vbvMinRate = 0;
     if (cfg->param.rc.vbvBufferSize)
     {
         if (cfg->param.rc.rateControlMode == X265_RC_CQP)
@@ -317,6 +318,13 @@
 pred[i].decay = 0.5;
 pred[i].offset = 0.0;
 }
+    for (int i = 0; i < 4; i++)
+    {
+        rowPreds[i].coeff = 0.25;
+        rowPreds[i].count = 1.0;
+        rowPreds[i].decay = 0.5;
+        rowPreds[i].offset = 0.0;
+    }
 
 predBfromP = pred[0];
 bframes = cfg-param.bframes;
@@ -374,10 +382,12 @@
     rce->bLastMiniGopBFrame = pic->m_lowres.bLastMiniGopBFrame;
     rce->bufferRate = bufferRate;
     rce->poc = curSlice->getPOC();
-
     if (isVbv)
+    {
+        rowPred[0] = &rowPreds[sliceType];
+        rowPred[1] = &rowPreds[3];
         updateVbvPlan(enc);
-
+    }
     if (isAbr) //ABR,CRF
     {
         currentSatd = l->getEstimatedPictureCost(pic);
@@ -385,9 +395,10 @@
         rce->lastSatd = currentSatd;
         double q = qScale2qp(rateEstimateQscale(pic, rce));
         qp = Clip3(MIN_QP, MAX_MAX_QP, (int)(q + 0.5));
-        rce->qpaRc = q;
+        rce->qpaRc = pic->m_avgQpRc = q;
         /* copy value of lastRceq into thread local rce struct *to be used in RateControlEnd() */
         rce->qRceq = lastRceq;
+        rce->qpNoVbv = qpNoVbv;
         accumPQpUpdate();
     }
 else //CQP
@@ -397,9 +408,12 @@
 else
 qp = qpConstant[sliceType];
 }
-
 if (sliceType != B_SLICE)
+{
 lastNonBPictType = sliceType;
+leadingNoBSatd = currentSatd;
+}
+rce-leadingNoBSatd = leadingNoBSatd;
 framesDone++;
 /* set the final QP to slice structure */
 curSlice-setSliceQp(qp);
@@ -461,10 +475,11 @@
 q += pbOffset / 2;
 else
 q += pbOffset;
-
-        rce->frameSizePlanned = predictSize(&predBfromP, qp2qScale(q), leadingNoBSatd);
-
-        return qp2qScale(q);
+        qpNoVbv = q;
+        double qScale = qp2qScale(qpNoVbv);
+        rce->frameSizePlanned = predictSize(&predBfromP, qScale, (double)leadingNoBSatd);
+        frameSizeEstimated = rce->frameSizePlanned;
+        return qScale;
 }
 else
 {
@@ -555,7 +570,7 @@
             if (qCompress != 1 && framesDone == 0)
                 q = qp2qScale(ABR_INIT_QP) / fabs(cfg->param.rc.ipFactor);
 }
-
+qpNoVbv = qScale2qp(q);
 double lmin1 = lmin[sliceType];
 double lmax1 = lmax[sliceType];
 q = Clip3(lmin1, lmax1, q);
@@ -840,7 +855,7 @@
         if (rce->bLastMiniGopBFrame)
         {
             if (rce->bframes != 0)
-                updatePredictor(&predBfromP, qp2qScale(rce->qpaRc), (double)rce->lastSatd, (double)bframeBits / rce->bframes);
+                updatePredictor(&predBfromP, qp2qScale(rce->qpaRc), (double)rce->leadingNoBSatd, (double)bframeBits / rce->bframes);
 bframeBits = 0;
 }
 }
diff -r cbfb5bc44d6b -r 9517ea331f3e source/encoder/ratecontrol.h
--- a/source/encoder/ratecontrol.h  Thu Feb 20 17:03:38 2014 +0530
+++ b/source/encoder/ratecontrol.h  Thu Feb 20 17:47:53 2014 +0530
@@ -53,7 +53,7 @@
 int mvBits;
 int bframes;
 int poc;
-
+int64_t leadingNoBSatd;
 bool bLastMiniGopBFrame;
 double blurredComplexity;
 double qpaRc;
@@ -61,8 +61,8 @@
 double frameSizePlanned;
 double bufferRate;
 double movingAvgSum;
+double qpNoVbv;
 };
-
 struct Predictor
 {
 double coeff;
@@ -94,9 +94,10 @@
 bool isVbv;
 Predictor pred[5];
 Predictor predBfromP;
+Predictor rowPreds[4];
+Predictor *rowPred[2];
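
The patch shows only how the row predictors are initialised (coeff 0.25, count 1.0, decay 0.5, offset 0.0), not how they are evaluated. As a rough illustration, the sketch below assumes an x264-style linear predictor in which predicted bits scale with SATD and inversely with the quantizer scale. The formula and the helper initialRowPredictor are assumptions for illustration, not taken from this patch.

```cpp
#include <cassert>

struct Predictor
{
    double coeff;
    double count;
    double decay;
    double offset;
};

// Assumed x264-style size prediction: bits grow with complexity (SATD)
// and shrink as the quantizer scale grows.
double predictSize(const Predictor& p, double qScale, double satd)
{
    assert(qScale > 0 && p.count > 0);
    return (p.coeff * satd + p.offset) / (qScale * p.count);
}

// Initial state matching the values the patch assigns to rowPreds[i].
Predictor initialRowPredictor()
{
    return Predictor{ 0.25, 1.0, 0.5, 0.0 };
}
```

Under that assumption, a freshly initialised row predictor estimates a quarter of the pending SATD divided by the quantizer scale, and later updates would refine coeff and count from the bits actually spent.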

[x265] [PATCH 6 of 6] vbv: integrate row level vbv ratecontrol at each major row diagonal

2014-02-20 Thread aarthi
# HG changeset patch
# User Aarthi Thirumalai
# Date 1392901254 -19800
#  Thu Feb 20 18:30:54 2014 +0530
# Node ID 72f607f2dc765007149c1d933ec18154f513c5e7
# Parent  22d4811e0676fbec5e8bf96d99be9b98020fc89f
vbv: integrate row level vbv ratecontrol at each major row diagonal.

diff -r 22d4811e0676 -r 72f607f2dc76 source/encoder/frameencoder.cpp
--- a/source/encoder/frameencoder.cpp   Fri Feb 21 11:35:48 2014 +0530
+++ b/source/encoder/frameencoder.cpp   Thu Feb 20 18:30:54 2014 +0530
@@ -1057,6 +1057,7 @@
     CTURow& codeRow = m_rows[m_cfg->param.bEnableWavefront ? row : 0];
     const uint32_t numCols = m_pic->getPicSym()->getFrameWidthInCU();
     const uint32_t lineStartCUAddr = row * numCols;
+    double qpBase = m_pic->m_avgQpRc;
     for (uint32_t col = curRow.m_completed; col < numCols; col++)
     {
         const uint32_t cuAddr = lineStartCUAddr + col;
@@ -1065,26 +1066,41 @@
 
         codeRow.m_entropyCoder.setEntropyCoder(&m_sbacCoder, m_pic->getSlice());
         codeRow.m_entropyCoder.resetEntropy();
+        TEncSbac *bufSbac = (m_cfg->param.bEnableWavefront && col == 0 && row > 0) ? &m_rows[row - 1].m_bufferSbacCoder : NULL;
 
-        TEncSbac *bufSbac = (m_cfg->param.bEnableWavefront && col == 0 && row > 0) ? &m_rows[row - 1].m_bufferSbacCoder : NULL;
-        if (m_cfg->param.rc.aqMode)
+        if ((uint32_t)row >= col && (row !=0))
+            qpBase = m_pic->getCU(cuAddr - numCols + 1)->m_baseQp;
+
+        if (m_cfg->param.rc.aqMode || (m_cfg->param.rc.vbvBufferSize >0 && m_cfg->param.rc.vbvMaxBitrate >0))
         {
-            int qp = calcQpForCu(m_pic, cuAddr);
+            int qp = calcQpForCu(m_pic, cuAddr , qpBase);
             setLambda(qp, row);
-            if (qp > MAX_QP)
-                qp = MAX_QP;
-            cu->setQP(0, (char)qp);
+            qp = X265_MIN(qp, MAX_QP);
+            cu->setQP(0,char(qp));
+            cu->m_baseQp = qpBase;
         }
         codeRow.processCU(cu, m_pic->getSlice(), bufSbac, m_cfg->param.bEnableWavefront && col == 1);
+        if (m_cfg->param.rc.vbvBufferSize && m_cfg->param.rc.vbvMaxBitrate)
+        {
+            // Update encoded bits, satdCost, baseQP for each CU
+            m_pic->m_rowDiagSatd[row] += m_pic->m_cuCostsForVbv[cuAddr];
+            m_pic->m_rowEncodedBits[row] += cu->m_totalBits;
+            m_pic->m_numEncodedCusPerRow[row] = cuAddr;
+            m_pic->m_qpaAq[row] += cu->getQP(0);
+            m_pic->m_qpaRc[row] += cu->m_baseQp;
 
-        // TODO: Keep atomic running totals for rate control?
-        // cu->m_totalBits;
-        // cu->m_totalCost;
-        // cu->m_totalDistortion;
+            if ((uint32_t)row == col)
+                m_pic->m_rowDiagQp[row] = qpBase;
 
+            // If current block is at row diagonal checkpoint, call vbv ratecontrol.
+            if ((uint32_t)row == col && row != 0 )
+            {
+                 m_top->m_rateControl->rowDiagonalVbvRateControl(m_pic, row, &m_rce, qpBase);
+                 qpBase = Clip3((double)MIN_QP, (double)MAX_MAX_QP, qpBase);
+            }
+        }
         // Completed CU processing
         m_rows[row].m_completed++;
-
         if (m_rows[row].m_completed >= 2 && row < m_numRows - 1)
         {
             ScopedLock below(m_rows[row + 1].m_lock);
@@ -1127,38 +1143,43 @@
 m_totalTime = m_totalTime + (x265_mdate() - startTime);
 curRow.m_busy = false;
 }
-
-int FrameEncoder::calcQpForCu(TComPic *pic, uint32_t cuAddr)
+int FrameEncoder::calcQpForCu(TComPic *pic, uint32_t cuAddr, double baseQp)
 {
     x265_emms();
-    double qp = pic->getSlice()->m_avgQpRc;
-    if (m_cfg->param.rc.aqMode)
+    double qp = baseQp;
+
+    /* Derive qpOffet for each CU by averaging offsets for all 16x16 blocks in the cu. */
+    double qp_offset = 0;
+    int maxBlockCols = (pic->getPicYuvOrg()->getWidth() + (16 - 1)) / 16;
+    int maxBlockRows = (pic->getPicYuvOrg()->getHeight() + (16 - 1)) / 16;
+    int noOfBlocks = g_maxCUWidth / 16;
+    int block_y = (cuAddr / pic->getPicSym()->getFrameWidthInCU()) * noOfBlocks;
+    int block_x = (cuAddr * noOfBlocks) - block_y * pic->getPicSym()->getFrameWidthInCU();
+
+    double *qpoffs = (pic->getSlice()->isReferenced() && m_cfg->param.rc.cuTree) ? pic->m_lowres.qpOffset : pic->m_lowres.qpAqOffset;
+    int cnt = 0, idx =0;
+    for (int h = 0; h < noOfBlocks && block_y < maxBlockRows; h++, block_y++)
     {
-        /* Derive qpOffet for each CU by averaging offsets for all 16x16 blocks in the cu. */
-        double qp_offset = 0;
-        int maxBlockCols = (pic->getPicYuvOrg()->getWidth() + (16 - 1)) / 16;
-        int maxBlockRows = (pic->getPicYuvOrg()->getHeight() + (16 - 1)) / 16;
-        int noOfBlocks = g_maxCUWidth / 16;
-        int block_y = (cuAddr / pic->getPicSym()->getFrameWidthInCU()) * noOfBlocks;
-        int block_x = (cuAddr * noOfBlocks) - block_y * pic->getPicSym()->getFrameWidthInCU();
+        for (int w = 0; w < noOfBlocks && (block_x + w) < maxBlockCols; w++)
+        {
+            idx = block_x + w + (block_y * maxBlockCols);
+if 

[x265] [PATCH] common: validate bframe and maxCUSize for positive values

2014-02-20 Thread sumalatha
# HG changeset patch
# User Sumalatha Polureddy
# Date 1392966000 -19800
# Node ID 00faf694d2feefbe49a8d91ac4afaada93cab53c
# Parent  0c19c44af2d3a8825d804597f1c2f82e32e4d4b7
common: validate bframe and maxCUSize for positive values

diff -r 0c19c44af2d3 -r 00faf694d2fe source/common/common.cpp
--- a/source/common/common.cpp  Fri Feb 21 12:23:22 2014 +0900
+++ b/source/common/common.cpp  Fri Feb 21 12:30:00 2014 +0530
@@ -460,6 +460,10 @@
 {
 #define CHECK(expr, msg) check_failed |= _confirm(param, expr, msg)
     int check_failed = 0; /* abort if there is a fatal configuration problem */
+    CHECK(param->maxCUSize > 2147483648/*2^31*/,
+          "maximum CU size should positive number");
+    if (check_failed == 1)
+        return check_failed;// return if maxCUSize is negative, since maxcusize is used in below code for accessing an array which should be positive
     uint32_t maxCUDepth = (uint32_t)g_convertToBit[param->maxCUSize];
     uint32_t tuQTMaxLog2Size = maxCUDepth + 2 - 1;
     uint32_t tuQTMinLog2Size = 2; //log2(4)
@@ -530,6 +534,8 @@
           "RD Level is out of range");
     CHECK(param->bframes > param->lookaheadDepth,
           "Lookahead depth must be greater than the max consecutive bframe count");
+    CHECK(param->bframes < 0,
+          "bframe count should be greater than zero");
     CHECK(param->bframes > X265_BFRAME_MAX,
           "max consecutive bframe count must be 16 or smaller");
     CHECK(param->lookaheadDepth > X265_LOOKAHEAD_MAX,


[x265] Unclear VUI options

2014-02-20 Thread Mario *LigH* Rohkrämer
The possibly unintended indentation of this line (due to a later inserted
"[no-]", I believe) made me wonder what the desired format of this parameter
is:


--[no-]range — Specify black level and range of luma and chroma signals.  
Default of disabled


I know terms like TV range (luma: 16-235; chroma: 16-240) and PC range
(both 0-255). This parameter looks as if it is meant to be used as a
boolean value, which is not really plausible to me.



Furthermore, a few additions from your German Grammar Nazi:

Some parameter descriptions use "Default", others "Default of". And
"Usability" has no "e".


--
__

Fun and success!
Mario *LigH* Rohkrämer
mailto:cont...@ligh.de

___
x265-devel mailing list
x265-devel@videolan.org
https://mailman.videolan.org/listinfo/x265-devel
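
For reference, the two ranges mentioned above differ only by a linear mapping, and the HEVC VUI signals the choice with a single flag (video_full_range_flag), which may be why the option reads like a boolean. A minimal sketch of the 8-bit mapping follows; the function names are hypothetical and the rounding choice is an assumption:

```cpp
#include <cassert>

// Map an 8-bit full-range ("PC") luma sample 0..255 to limited ("TV") range 16..235.
int lumaFullToLimited(int y)
{
    assert(y >= 0 && y <= 255);
    return 16 + (219 * y + 127) / 255; // integer division with rounding
}

// Map an 8-bit full-range chroma sample 0..255 to limited range 16..240.
int chromaFullToLimited(int c)
{
    assert(c >= 0 && c <= 255);
    return 16 + (224 * c + 127) / 255;
}
```

The flag itself does not change any samples; it only tells the decoder or renderer which of the two interpretations applies.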


Re: [x265] clear m_tempPel only once

2014-02-20 Thread Steve Borho
On Thu, Feb 20, 2014 at 11:57 PM, Satoshi Nakagawa nakagawa...@oki.com wrote:
 # HG changeset patch
 # User Satoshi Nakagawa nakagawa...@oki.com
 # Date 1392962042 -32400
 #  Fri Feb 21 14:54:02 2014 +0900
 # Node ID 3706b57addade77ee6d06bd95600c99e80abb93a
 # Parent  0c19c44af2d3a8825d804597f1c2f82e32e4d4b7
 clear m_tempPel only once

 diff -r 0c19c44af2d3 -r 3706b57addad source/Lib/TLibEncoder/TEncSearch.cpp
 --- a/source/Lib/TLibEncoder/TEncSearch.cpp Fri Feb 21 12:23:22 2014 +0900
 +++ b/source/Lib/TLibEncoder/TEncSearch.cpp Fri Feb 21 14:54:02 2014 +0900
 @@ -79,7 +79,7 @@

  TEncSearch::~TEncSearch()
  {
 -delete [] m_tempPel;
 +X265_FREE(m_tempPel);

I was digging around this same function today and actually changed a
lot of these buffers to X265_MALLOC'd pixel or uint8_t arrays, so
unfortunately this collides with those patches.


  if (m_cfg)
  {
 @@ -135,7 +135,8 @@

      initTempBuff(cfg->param.internalCsp);

 -m_tempPel = new Pel[g_maxCUWidth * g_maxCUHeight];
 +m_tempPel = X265_MALLOC(Pel, g_maxCUWidth * g_maxCUHeight);
 +memset(m_tempPel, 0, sizeof(Pel) * g_maxCUWidth * g_maxCUHeight);

hmm; after I push my commits maybe you could move this memset and
rename the buffer to m_zerobuf or something.  I didn't realize it was
only used to measure sum of squares.


      const uint32_t numLayersToAllocate = cfg->getQuadtreeTULog2MaxSize() - cfg->getQuadtreeTULog2MinSize() + 1;
      m_qtTempCoeffY  = new TCoeff*[numLayersToAllocate];
 @@ -3564,8 +3565,6 @@
      const uint32_t numSamplesLuma = 1 << (trSizeLog2 << 1);
      const uint32_t numSamplesChroma = 1 << (trSizeCLog2 << 1);

 -    ::memset(m_tempPel, 0, sizeof(Pel) * numSamplesLuma); // not necessary needed for inside of recursion (only at the beginning)
 -
      int partSize = partitionFromSizes(trWidth, trHeight);
      uint32_t distY = primitives.sse_sp[partSize](resiYuv->getLumaAddr(absTUPartIdx), resiYuv->m_width, m_tempPel, trWidth);




-- 
Steve Borho
___
x265-devel mailing list
x265-devel@videolan.org
https://mailman.videolan.org/listinfo/x265-devel
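
The observation that m_tempPel is only used to measure a sum of squares can be illustrated with a small sketch: comparing a residual block against an all-zero pixel block with a squared-error primitive yields the residual's energy. The names sseShortPel and residualEnergy are hypothetical stand-ins for an sse_sp-style primitive, and Pel is assumed to be an 8-bit pixel type here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef uint8_t Pel; // placeholder for the encoder's pixel type in an 8-bit build

// Simplified squared-error primitive between a 16-bit residual block and a
// pixel block, each with its own stride.
uint32_t sseShortPel(const int16_t* resi, int resiStride,
                     const Pel* pix, int pixStride, int width, int height)
{
    assert(width > 0 && height > 0);
    uint32_t sum = 0;
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            const int diff = resi[y * resiStride + x] - pix[y * pixStride + x];
            sum += static_cast<uint32_t>(diff * diff);
        }
    }
    return sum;
}

// With an all-zero reference block the squared error is just the sum of
// squares of the residual, which is why the buffer only needs clearing once
// as long as nothing else writes to it.
uint32_t residualEnergy(const int16_t* resi, int stride, int width, int height)
{
    const std::vector<Pel> zeros(static_cast<std::size_t>(width) * height, 0);
    return sseShortPel(resi, stride, zeros.data(), width, width, height);
}
```

This also suggests why a name like m_zerobuf would describe the buffer's role more directly than m_tempPel.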